
Data Mining and Visualization for Business Intelligence - Assignment 3

   

Added on  2024-05-31

Data Science and Big Data
ITC 516
DATA MINING AND VISUALISATION FOR BUSINESS
INTELLIGENCE
ASSIGNMENT 3
Student Name: Gurwinder Singh
Student ID:

Table of Contents
INTRODUCTION
Task 1: DATA MINING TASK
DECISION TREE
NAÏVE BAYES
K-NEAREST NEIGHBOUR
Task 2
k-NN vs Naïve Bayes vs Decision Tree
Algorithm Performance
Conclusion
References
List of Tables
Table 1: Analysis of k-NN, Naive Bayes and Decision Tree
List of Figures
Figure 1: Weka Index Page
Figure 2: Next Page
Figure 3: soybean.arff file
Figure 4: Soybean.arff file loaded in Weka for analysis
Figure 5: Accuracy of the Class using Decision Tree
Figure 6: Confusion Matrix by Decision Tree Analysis
Figure 7: Accuracy of the Class using Naive Bayes
Figure 8: Confusion Matrix by Naive Bayes Analysis
Figure 9: Accuracy of the Class using KNN Classifier
Figure 10: Confusion Matrix by KNN Classifier Analysis

INTRODUCTION
Data mining is the process of analysing complex data with analytical techniques in order to
gain insight into it and to discover patterns, whether previously known or unknown. Various
tools are available for the data analysis and processing phase, and they support a more
effective analytics approach and a better understanding of the data. Analysing a dataset relies
on a knowledge base, which helps to make better predictions and to analyse the data more
effectively (Jadhav & Channe, 2016).
The aim of this report is to analyse the business requirements for pattern identification. It
focuses on different data mining problems whose output patterns can be compared. The provided
dataset is critically analysed, and several patterns are examined using the Weka software,
which gives better insight into the dataset.
The dataset used in this report is an ARFF file. ARFF stands for Attribute-Relation File
Format, an ASCII text file format that describes a list of instances sharing a set of
attributes. It is used by the Weka software for machine learning projects. This report focuses
on three data classification algorithms and uses their analysis results to determine which one
performs best on the dataset.
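To make the ARFF layout concrete, the following is a minimal sketch of how such a file is structured and how it could be read. The sample data is a hypothetical toy relation, not the real soybean.arff, and the `parse_arff` helper is an illustrative assumption rather than part of Weka. It handles only simple, unquoted, dense ARFF text.

```python
# Hypothetical toy ARFF text: a header (@relation, @attribute) followed by @data rows.
SAMPLE_ARFF = """\
% Toy example for illustration only, not the real soybean.arff
@relation toy-plants

@attribute leaves {normal, abnormal}
@attribute height numeric
@attribute class {healthy, diseased}

@data
normal,12.5,healthy
abnormal,7.0,diseased
abnormal,?,diseased
"""

NUMERIC_TYPES = ("numeric", "real", "integer")


def parse_arff(text):
    """Return (relation, attributes, rows) for simple, unquoted ARFF text."""
    relation = None
    attributes = []  # (name, kind): kind is a type name or a list of nominal labels
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                if value == "?":  # "?" marks a missing value in ARFF
                    row[name] = None
                elif kind in NUMERIC_TYPES:
                    row[name] = float(value)
                else:
                    row[name] = value
            rows.append(row)
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):  # nominal attribute: {label1, label2, ...}
                kind = [label.strip() for label in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation, len(attributes), len(rows))
```

The header declares each attribute once, and every line after `@data` is one instance with values in the same order. The last attribute is conventionally used as the class, although that is a choice made in the analysis tool rather than something the file format itself requires.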

Task 1: DATA MINING TASK
The following steps were carried out for this analysis task.
1. Run Weka (Weka v3.8 is used for this analysis).
Figure 1: Weka Index Page
2. Click the Open File tab, locate the soybean.arff file and open it.
Figure 2: Next Page
