Restaurant Data Analysis using Naïve Bayes: A Data Mining Approach

Added on 2025/08/30

Using Naïve Bayes to predict an event
trend from restaurant data

Table of Contents
Abstract
Introduction
Summary of dataset
Data mining techniques
Evaluation and demonstration
Conclusion
References
List of figures
Figure 1: opening of file
Figure 2: choose Naive Bayes to apply on data
Figure 3: start the process
Figure 4: run information of Naive Bayes
Figure 5: summary of Naive Bayes classifier on dataset
Figure 6: supplied test set
Figure 7: applying test file
Figure 8: output after applying test set
Figure 9: result after loading model
Figure 10: bagging classifier result
Figure 11: Adaboost classifier result
Figure 12: Adaboost run information
Figure 13: one classifier
Figure 14: delete default ZeroR classifier
Figure 15: addition of two or more classifiers
Figure 16: two classifiers
Figure 17: run information of vote classifier using Naive Bayes and Bayes Net
Figure 18: summary of vote classifier using Naive Bayes and Bayes Net
Figure 19: output of random forest
Figure 20: output of the Naive Bayes classifier

Abstract
This assignment concerns the classification of data using classifiers. It is a practical
exercise that uses the Weka software to run classifiers on a dataset. The chosen dataset is
based on the chefs of the restaurant and contains two attributes, placeID and Rpayment. The
data was taken from the internet, from the link provided in the brief. To work in Weka, the
CSV file is converted into the ARFF format that Weka accepts. Different classifiers are run on
the dataset to extract information about the data and to summarise it. The exercise gives
practice in machine learning and data mining techniques, which help to analyse the data and to
predict future trends. The assignment has three parts, each covering a different aspect of the
data. The differences between the algorithms are described, together with an explanation of
why their results differ. Evidence is included to show that both classifiers produce different
results. Finally, after a discussion of both classifiers, the best one is identified with
proper justification.
Introduction
Data analytics is the process of examining data in order to draw conclusions about the
information in a dataset. It helps to predict future trends using a range of techniques and
technologies. Trend prediction helps an organization make decisions more accurately and
precisely. The main aim of applying data mining techniques to a dataset is to improve decision
making in industry. Through predictive analysis, this approach helps to identify patterns and
trends. It benefits the industry because it can help to increase business revenue. This report
applies the same data mining concepts to a dataset in order to obtain summarised information
about the data. The dataset used is the chefs' payment methods dataset, to which different
classifiers are applied. The practical is divided into three parts. The first part applies a
classifier to the training dataset using 10-fold cross-validation. The second part evaluates a
classifier built on the training dataset against a supplied test dataset. The third and last
part applies two or more classifiers to a single dataset at the same time.

Summary of dataset
The dataset is taken from restaurant data and records the payment methods of the restaurants
along with the ID of each place. The data is voluminous and contains many records. The dataset
contains two attributes, placeID and Rpayment. placeID holds the ID of the restaurant, whereas
Rpayment holds the payment options that the restaurant accepts. placeID is an integer, whereas
Rpayment contains string values.
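The CSV-to-ARFF conversion mentioned in the abstract can be sketched as follows. This is a
minimal illustration, not the conversion actually used for the assignment: the relation name
and the sample rows are hypothetical, and values containing spaces or commas would need
quoting, which is omitted here.

```python
import csv
import io


def csv_to_arff(csv_text, relation="restaurant_payments"):
    """Convert a two-column placeID/Rpayment CSV string into ARFF text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Rpayment is declared nominal (a fixed set of values) so that a
    # classifier can treat it as a class attribute.
    payments = sorted({row["Rpayment"] for row in rows})
    lines = [
        f"@relation {relation}",  # relation name is an assumption
        "",
        "@attribute placeID numeric",
        "@attribute Rpayment {" + ",".join(payments) + "}",
        "",
        "@data",
    ]
    lines += [f"{row['placeID']},{row['Rpayment']}" for row in rows]
    return "\n".join(lines) + "\n"


# Hypothetical sample rows, not taken from the real dataset.
sample = "placeID,Rpayment\n1001,cash\n1001,VISA\n1002,cash\n"
print(csv_to_arff(sample))
```

The header declares one attribute per column, and every line after @data is one instance.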
Data mining techniques
Learning methods include classification and clustering. The two are similar to some extent but
differ in context. Classification is a supervised learning technique in which the labels are
predefined, whereas clustering is an unsupervised learning technique in which similar
instances are grouped on the basis of their properties or features. Supervised mining is used
for this problem, which means that it is a classification problem (Dai et al., 2007).
The algorithm adopted for prediction on the restaurant dataset is the Naïve Bayes classifier.
This classifier is highly scalable: the number of parameters it requires is linear in the
number of variables. It is commonly used to categorize text, with word frequencies as the
features. Naïve Bayes models assign class labels to problem instances. The algorithm assumes
that, given the class, the value of each feature is independent of the value of every other
feature (Kim et al., 2006).
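The independence assumption described above can be illustrated with a small categorical Naïve
Bayes written from scratch. This is only a sketch of the general technique, not Weka's
implementation; the feature values (an area and a size) and the labels are hypothetical, and
Laplace smoothing is an added assumption to keep unseen values from producing zero
probabilities.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the class with the highest log posterior for one instance."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for i, value in enumerate(row):
            # Each feature contributes independently (the "naive" assumption).
            # Laplace smoothing: add 1 to the count, add the number of
            # distinct values to the denominator.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(vocab[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (area, size) -> payment method.
rows = [("north", "small"), ("north", "large"), ("south", "small"),
        ("south", "large"), ("north", "small")]
labels = ["cash", "card", "cash", "card", "cash"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("south", "small")))
```

Because each feature adds its own log-probability term, the number of stored counts grows
linearly with the number of features, which is the scalability property noted above.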
The process of applying the classifier to the dataset is as follows:
Figure 1: opening of file
Using the Open file option, open the ARFF file to which the data mining algorithm is to be
applied.

Figure 2: choose Naive Bayes to apply on data
After opening the file, go to the Classify tab and choose the classifier. Here the chosen
classifier is Naïve Bayes, evaluated with 10-fold cross-validation. The percentage split field
is set to 66%, but Weka applies only the test option that is selected.
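The idea behind 10-fold cross-validation can be sketched as below. This is a simplified
illustration, assuming instances are addressed by index; Weka additionally shuffles the data
and, for a nominal class, stratifies the folds, which this sketch leaves out.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_instances))
    # Fold i takes every k-th instance starting at i, so fold sizes
    # differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # All other folds together form the training set for this round.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each instance is used for testing exactly once across the ten rounds.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

The classifier is trained and evaluated once per fold, and the per-fold results are combined
into a single summary.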
Figure 3: start the process
After choosing the classifier and setting its options, start the classification process by
clicking the Start button.

Figure 4: run information of Naive Bayes
This image shows the run information produced by Naïve Bayes for the dataset.
Figure 5: summary of Naive Bayes classifier on dataset
This is the summary of the results for the dataset. It shows two results: the detailed
accuracy by class and the confusion matrix.
Save the trained classifier as model1.model.
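How a confusion matrix relates to the per-class accuracy figures can be sketched as follows.
The labels are hypothetical and only precision and recall are computed; this is an
illustration of the definitions, not a reproduction of the Weka output.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def precision_recall(matrix, label):
    """Compute precision and recall for one class from the pair counts."""
    true_positive = matrix[(label, label)]
    predicted_as = sum(n for (a, p), n in matrix.items() if p == label)
    actually = sum(n for (a, p), n in matrix.items() if a == label)
    precision = true_positive / predicted_as if predicted_as else 0.0
    recall = true_positive / actually if actually else 0.0
    return precision, recall


# Hypothetical labels for illustration only.
actual = ["cash", "cash", "card", "card", "cash"]
predicted = ["cash", "card", "card", "card", "cash"]
matrix = confusion_matrix(actual, predicted)
for label in ("cash", "card"):
    print(label, precision_recall(matrix, label))
```

Precision is the share of instances predicted as a class that really belong to it, and recall
is the share of instances of a class that were predicted correctly.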

Figure 6: supplied test set
After running the Naïve Bayes classifier on the original file, click on the Supplied test set
option and load the test file.
Figure 7: applying test file
This shows the loading of the test file. Weka will re-evaluate the classifier using the test data.

Figure 8: output after applying test set
This image shows the output after applying the test set. The results differ from those
obtained on the original dataset.
Figure 9: result after loading model
This result shows the outcome for the dataset using the classifier after loading the model
that was saved as model1.model.
Figure 10: bagging classifier result
This is the result screen for the dataset after applying the Bagging classifier from the
family of meta classifiers.
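For readers unfamiliar with bagging, the general idea is sketched below: several base learners
are trained on bootstrap samples and combined by majority vote. This is a generic
illustration, not Weka's Bagging implementation; the toy base learner, the seed and the data
are hypothetical.

```python
import random
from collections import Counter


def bagging_fit(rows, labels, train_fn, n_models=10, seed=1):
    """Train n_models base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        # A bootstrap sample draws n indices with replacement.
        picks = [rng.randrange(n) for _ in range(n)]
        models.append(train_fn([rows[i] for i in picks],
                               [labels[i] for i in picks]))
    return models


def bagging_predict(models, row):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


def majority_learner(rows, labels):
    """Toy base learner (hypothetical): always predicts the most common label."""
    top = Counter(labels).most_common(1)[0][0]
    return lambda row: top


# Hypothetical data for illustration only.
rows = [(1001,), (1002,), (1003,), (1004,)]
labels = ["cash", "cash", "cash", "card"]
models = bagging_fit(rows, labels, majority_learner)
print(bagging_predict(models, (1005,)))
```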
Figure 11: Adaboost classifier result
This is the result screen for the dataset after applying the Adaboost classifier from the
family of meta classifiers.
Figure 12: Adaboost run information
This is the summary of the result, which shows the detailed accuracy by class and the
confusion matrix.

Figure 13: one classifier
This image shows that only one classifier is used to obtain the results. To use more than one
classifier, select the classifier and double-click on it; this opens the page shown above.
Then select the classifier. Clicking on it opens the page shown below.
Figure 14: delete default ZeroR classifier
Here, remove the ZeroR classifier, which is selected by default, by clicking on the Delete
option. Then choose from the listed classifiers by clicking on the Choose button.

Figure 15: addition of two or more classifiers
Here, the two selected classifiers are Naïve Bayes and Bayes Net. Then click on Add.
Figure 16: two classifiers
As the above image shows, the number of classifiers has changed from one to two. This means
that the results will now be produced using the two classifiers together with the main
classifier.
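One common way to combine two classifiers under a voting scheme is to average their class
probabilities, which is sketched below. The probability values are hypothetical and the
function name is illustrative; averaging is assumed here as the combination rule and may not
match the settings used in the practical.

```python
def vote_average(distributions):
    """Average several class-probability distributions and pick the top class."""
    labels = set().union(*distributions)  # every class seen by any classifier
    averaged = {
        label: sum(d.get(label, 0.0) for d in distributions) / len(distributions)
        for label in labels
    }
    return max(averaged, key=averaged.get), averaged


# Hypothetical outputs of the two base classifiers for one instance.
naive_bayes_probs = {"cash": 0.7, "card": 0.3}
bayes_net_probs = {"cash": 0.4, "card": 0.6}
winner, averaged = vote_average([naive_bayes_probs, bayes_net_probs])
print(winner, averaged)
```

In this example the two classifiers disagree, and the combined prediction follows the one
that is more confident.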