Decision Tree Intuition: From Concept to Application - Analysis Report

This report provides an overview of decision trees, a powerful machine learning algorithm used for data analysis and prediction. It covers the core concepts of root nodes, splitting, and leaf nodes, and examines algorithms such as ID3 (Iterative Dichotomiser), which uses information gain and entropy, and CART (Classification and Regression Trees), which uses the Gini index. The report discusses the advantages and disadvantages of decision tree models, including the risk of overfitting, and notes the evolution of more advanced models such as random forests and XGBoost. It also mentions Python packages such as numpy, pandas, and scikit-learn for implementing these algorithms. Finally, the report emphasizes the importance of selecting appropriate models for specific problem domains to improve prediction accuracy.
Running head: DECISION TREE INTUITION: FROM CONCEPT TO APPLICATION
DECISION TREE INTUITION: FROM CONCEPT TO APPLICATION
Name of the Student
Name of the University
Author Note
Machine learning has proved to be a powerful and popular technology for analyzing data and making predictions based on it. The article chosen for this essay is 'Decision Tree Intuition: From Concept to Application'. The reason behind choosing this topic is that it presents a powerful machine learning algorithm that builds a model to predict events from past information. Decision trees support decision making through techniques such as classification, regression, and random forests (Bae 2014). The decision tree is considered an old method whose prediction accuracy can suffer from overfitting.
Decision trees are built with four different algorithms, named ID3, Reduction in Variance, CART (Classification and Regression Trees) and Chi-square. A decision tree consists of a root node, splits, decision nodes and leaf nodes. The root node holds the full data set, which is split through decision nodes into sub-nodes; the terminal nodes, called leaf nodes, produce the outcome. ID3 (Iterative Dichotomiser) uses information gain to choose splits. The entropy equation is used to calculate the uncertainty in the data and is given by H = −∑ p(x) log p(x), where p(x) refers to the probability of the event x (KDnuggets 2020). First the entropy of one attribute is calculated, then the entropy of the contingency table. The information gain equation then gives the probabilistic prediction of the events. The tool scikit-learn can be used to implement the information gain method.
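The entropy and information-gain calculations described above can be sketched in Python with numpy; the helper names and the toy label split below are illustrative, not taken from the article:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy H = -sum(p * log2(p)) over the class probabilities."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child partitions."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

# Toy example: an attribute splitting 10 labels into two groups of 5
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4
print(round(entropy(parent), 3))                           # 1.0
print(round(information_gain(parent, [left, right]), 3))   # 0.278
```

At each node, ID3 evaluates this gain for every candidate attribute and splits on the one with the largest value.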
CART, by contrast, uses the Gini method to split, calculating the Gini index and Gini gain. If a player were assigned to a team at random, the probability of a wrong assignment is what the Gini index measures; this is used to gauge feature importance in the tree. The equation of the Gini index is
Gini = 1 − ∑ⱼ pⱼ², where pⱼ is the probability of event j. Gini gain is calculated as the difference between the impurity of the parent node and that of the child nodes (Jiang et al. 2016).
Figure 1: Calculation using the event table
Source: Decision Tree Intuition: From Concept to Application
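A minimal sketch of the Gini index and Gini gain calculation, with an assumed two-class label split for illustration (the function names and data are not from the article):

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_j(p_j ** 2) over the class probabilities."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def gini_gain(parent, children):
    """Parent impurity minus the size-weighted impurity of the child partitions."""
    n = len(parent)
    return gini(parent) - sum(len(c) / n * gini(c) for c in children)

# Toy example: 10 labels split by a candidate attribute into two groups
parent = ["A"] * 6 + ["B"] * 4
left = ["A"] * 5 + ["B"]
right = ["A"] + ["B"] * 3
print(round(gini(parent), 3))                      # 0.48
print(round(gini_gain(parent, [left, right]), 3))  # 0.163
```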
After calculating the Gini gain for each attribute from the event table, the attribute with the largest Gini gain is chosen as the parent node of the tree. Nodes with a Gini value of 0 become leaf nodes, while nodes with a Gini value greater than 0 are split further. The splitting continues until the data is classified. The tools used for classification and regression are based on Python software packages such as numpy, pandas and scikit-learn. Decision tree models are old, and their major disadvantage is overfitting when the tree grows deep. However, other models have since evolved, such as random forests and XGBoost, which give higher accuracy for more complex, deep and dynamic decision trees (Song and Ying 2015). Hence, it can be said that decision tree models are capable of predicting events based on past information, and using different models for different types of trees will provide better accuracy in a particular problem domain.
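The scikit-learn workflow the report alludes to can be sketched as follows; the iris dataset, the 70/30 split and the depth limit are illustrative choices, not details from the article:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# criterion="gini" gives CART-style splits; criterion="entropy" uses
# information gain instead. Capping max_depth is one simple guard against
# the overfitting of deep trees that the report mentions.
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

Switching the `criterion` parameter is enough to compare the Gini and entropy splitting strategies on the same data.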
References
Bae, J.M., 2014. The clinical decision analysis using decision tree. Epidemiology and Health, 36.
Jiang, L., Chen, H., Pinello, L. and Yuan, G.C., 2016. GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biology, 17(1), p.144.
KDnuggets, 2020. Decision Tree Intuition: From Concept to Application. [online] Available at: <https://www.kdnuggets.com/2020/02/decision-tree-intuition.html> [Accessed 17 March 2020].
Song, Y.Y. and Ying, L.U., 2015. Decision tree methods: applications for classification and prediction. Shanghai Archives of Psychiatry, 27(2), p.130.