SVM Classification with Linear and RBF kernels
Practical 9 - Predictive Modelling

Answer each of the questions below using the examples and code provided in the working Python file.

1. How many features are there for the iris dataset? How many examples? How many labels?

There are four features in the iris dataset, all measured in centimetres: sepal length, sepal width, petal length and petal width. Each column is a feature (also known as a predictor, attribute, independent variable, input, regressor or covariate).

There are 50 samples for each species of iris flower (Iris setosa, Iris versicolor and Iris virginica), giving 150 records (examples), each with the 4 features listed above. Each row is an observation (also known as a sample, example, instance or record).

Labels are also known as targets: each value we predict is the response (also known as the target, outcome, label or dependent variable). Classification is supervised learning where the label is categorical. There are 150 labels in the iris dataset, falling into 3 categories: 0 = setosa, 1 = versicolor, 2 = virginica.
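The counts above can be checked with scikit-learn's bundled copy of the iris data. The snippet below is a minimal sketch, assuming scikit-learn is available; the variable names are illustrative rather than taken from the working file.

from sklearn.datasets import load_iris

# Load the iris dataset bundled with scikit-learn
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 examples, 4 features
print(iris.feature_names)  # sepal/petal length and width, in cm
print(y.shape)             # (150,): one label per example
print(iris.target_names)   # ['setosa' 'versicolor' 'virginica']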
2. Why is it important to split the dataset into training and test sets? Why does a classification model need to be trained on the training set while its prediction performance is measured on the test set?

In machine learning, a model is an algorithm whose parameters must be adjusted so that it performs well at the task at hand, i.e. so that it predicts the values we want it to predict. We train the model on data called the training set. The training data already contain the true values the model should predict, so the learning algorithm adjusts its parameters to fit that data. To know whether the trained model is good overall, we use a test set: data for which the true values are also known but which the model has never seen before. If the model performs well on the test set too, we can say the model is good.

It is important to learn the predictive model (i.e. the classifier) on the training set and to test its performance on the test set. The purpose of predictive modelling is to create models that can predict on future data, so the training and test data must be kept separate and the test data must never be used for learning the model (a code sketch of the split appears after question 4). A classification model can be used to predict the class label of unknown records, and a classification technique is a systematic approach to building such models from an input data set. The model generated by a learning algorithm should both fit the input data well and correctly predict the class labels of records it has never seen before. First, a training set consisting of records whose class labels are known must be provided. The training set is used to build a classification model, which is subsequently applied to the test set, consisting of records whose class labels are treated as unknown. Evaluation of the performance of a classification model is based on the counts of test records correctly and incorrectly predicted by the model.

3. How can correlation analysis help identify the best features for the classification task? What are the best features for the iris data based on the correlation analysis results?

Data correlation measures how one set of values corresponds to another. For a classification problem, feature selection aims to pick a subset of highly discriminant features, i.e. features capable of discriminating samples that belong to different classes. Because label information is available, the relevance of a feature is assessed as its ability to distinguish the classes: a feature fi is said to be relevant to a class cj if fi and cj are highly correlated. Classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of observations whose category membership is known.

Based on the correlation analysis results, petal_length and petal_width are the best features for iris classification: the pair plots show that they are highly correlated with the class label. If a model is trained on a set of features with little or no correlation to the target, it will give inaccurate results.

4. Which class is easier to identify than the other two classes for the iris dataset? How can you tell?

Based on the correlation analysis results, the class setosa (target value 0) is easier to identify than the other two classes (1 = versicolor, 2 = virginica). As is evident in the pair plots, setosa (shown in blue) is easily separable from the other two iris species: it has a clear boundary around it, which makes it straightforward to distinguish from the other two classes.
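The sketch below illustrates the train/test split from question 2 and the correlation analysis from questions 3 and 4, assuming pandas and seaborn are available. The column names, the 70/30 split and the random seed are illustrative assumptions, not values taken from the working file.

import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# Hold out 30% of the records as a test set the model never sees during training
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

# Put the data in a DataFrame to inspect feature/label correlations
df = pd.DataFrame(iris.data, columns=['sepal_length', 'sepal_width',
                                      'petal_length', 'petal_width'])
df['species'] = iris.target

# Correlation of each feature with the class label:
# petal_length and petal_width show the strongest correlation
print(df.corr()['species'].sort_values(ascending=False))

# Pair plot: setosa separates cleanly from the other two classes
sns.pairplot(df, hue='species')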
5. Which classification model produces the better test result for the iris data: linear SVM trained on all features, or linear SVM trained on the two best features? What does this tell you?

Training a linear SVM on only the two best features lets us visualise the classification results by plotting the decision boundaries, where differently coloured regions correspond to different classes. Judging by the classifier test performance, however, the linear SVM trained on all features produces the better result for the iris dataset: the test accuracy rises to 95%, compared with 85% for the model trained on the two best features. This tells us that the remaining features still carry useful information for the classifier, even though the two-feature model is easier to visualise.

6. Why does linear SVM not produce a good result for the two moons example?

Linear SVM does not produce a good result for the two moons example because, although it is a binary classification problem, the two classes are not linearly separable: the targets in this dataset cannot be well separated by a linear classifier.

7. Compare linear and kernel SVM in terms of predictive performance and training speed. What conclusions can you make?

a. Kernel SVM achieves better predictive performance (higher accuracy) than linear SVM:
Accuracy of linear SVM = 86.0%
Accuracy of kernel SVM = 93.4%

b. Kernel SVM produces a nonlinear decision boundary (a curve) to separate the points of the two classes, shown as differently coloured regions, whereas linear SVM produces a linear decision boundary (a straight line), which is not appropriate in this case.

c. Although kernel SVM is effective, it is slower than linear SVM to train. When we measured the average time by training both the linear and the kernel SVM classifiers 3 * 100 times, the results were:
Linear SVM: 100 loops, best of 3: 11 ms per loop
Kernel SVM: 100 loops, best of 3: 22.8 ms per loop

We can conclude that kernel SVM is the better classifier in terms of predictive performance, while linear SVM is the better classifier in terms of training speed.
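This comparison can be reproduced roughly as in the sketch below, which trains a linear and an RBF-kernel SVM on scikit-learn's make_moons data and times each fit. The sample size, noise level and use of time.perf_counter (instead of the notebook's %timeit) are assumptions, so the exact accuracies and timings will differ from those quoted above.

import time
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: a binary problem that is not linearly separable
X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for name, clf in [('Linear SVM', SVC(kernel='linear')),
                  ('Kernel SVM (RBF)', SVC(kernel='rbf'))]:
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed_ms = (time.perf_counter() - start) * 1000
    acc = clf.score(X_test, y_test)
    print(f'{name}: test accuracy = {acc:.3f}, training time = {elapsed_ms:.1f} ms')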
8. Why do we need to perform parameter selection in training classification models for predictive modelling?

We need to perform parameter selection when training classification models because the choice of parameter values can further improve the model's performance. Based on the test results, both the training and the testing performance are affected by the choice of parameter. For example, using the regularisation parameter C, increasing C improves the training performance but not the testing performance, because the model overfits the training data.

9. Why can't we choose the classifier parameter that produces the best training performance?

We cannot choose the classifier parameter that produces the best training performance because maximising training accuracy rewards overly complex models that overfit the training data. An effective alternative is to select parameters by cross-validation on the training dataset.

10. What is cross-validation and why is it an effective technique for parameter selection in classifier training?

Cross-validation is used to assess the predictive performance of a model and to judge how it will perform outside the sample, on new data such as the test set. The motivation for cross-validation is that when we fit a model, we fit it to a training dataset; without cross-validation we only know how the model performs on the in-sample data, whereas ideally we want to know how accurate its predictions are on new data. In science, theories are judged by their predictive performance, and k-fold cross-validation is the form most commonly recommended in machine learning.

Cross-validation is an effective technique for parameter selection in classifier training because it uses the data efficiently (every observation is used for both training and validation) and it provides a more accurate estimate of out-of-sample accuracy. In our example, the accuracy achieved by the kernel SVM classifier trained with the optimal parameter value is higher than that of the kernel SVM classifier trained with the default parameter value, which confirms the importance and effectiveness of parameter selection.
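The parameter selection described in questions 8-10 can be sketched with k-fold cross-validation over a grid of parameter values, run on the training set only so that the test set stays unseen. The candidate values of C and gamma below, and the reuse of the two-moons data, are illustrative assumptions rather than the grid used in the working file.

from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# 5-fold cross-validation over a small grid of C and gamma values,
# using only the training set for model selection
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X_train, y_train)

print('Best parameters:', search.best_params_)
print('Cross-validated accuracy:', search.best_score_)
print('Test accuracy with the selected parameters:', search.score(X_test, y_test))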