Logistic Regression vs Decision Tree: A Comparative Analysis

Summary
This document delves into the comparison of two prominent classification algorithms: Logistic Regression and Decision Tree. It explores their working principles, advantages, disadvantages, and key differences. Logistic Regression, a linear model, excels in predicting binary outcomes, while Decision Tree, a non-linear model, provides a visual representation of the decision-making process. The document also discusses the concept of Maximum Likelihood Estimation (MLE) and Ordinary Least Squares (OLS) in the context of these algorithms. It further examines the role of attribute selection measures like Information Gain, Gain Ratio, and Gini Index in Decision Tree construction.

INTRODUCTION
Text mining – a widely used process within data mining
Logistic regression – preferred for linearly separable data
Decision tree – the non-linear counterpart of logistic regression
Decision tree – makes it possible to visualize the output and the logic behind it
Logistic regression – fixed (categorical) output; decision tree – can also model a continuous output
LOGISTIC REGRESSION
Used to predict a binary class
A special type of linear regression in which the target variable is categorical
Output is dichotomous in nature
Uses the logit function
Properties of logistic regression:
o Bernoulli distribution of the target variable
o MLE (maximum likelihood estimation) for fitting
o Evaluated with concordance and KS statistics
Equation of linear regression:
y = β0 + β1x1 + β2x2 + β3x3 + … + βnxn
Sigmoid function:
σ(z) = 1 / (1 + e^(−z))
Applying the sigmoid function to the linear regression equation turns it into logistic regression
SIGMOID FUNCTION
An "S"-shaped curve
Maps any input value into the range 0 to 1
Toward positive infinity – 1, or "yes"/"true"
Toward negative infinity – 0, or "no"/"false"
Value < 0.5 → class 0
Value >= 0.5 → class 1

TYPES OF LOGISTIC REGRESSION
1. Binary logistic regression – the target variable has only two possible classes (e.g. spam or not spam)
2. Multinomial logistic regression – the target variable has three or more unordered classes (e.g. type of wine)
3. Ordinal logistic regression – the target variable has three or more ordered classes (e.g. a restaurant rating from 1 to 5)
LINEAR REGRESSION VS LOGISTIC REGRESSION
Both are regression algorithms
Linear regression – gives a continuous output
Logistic regression – gives a fixed (categorical) output
Continuous output – a car or house price, a stock index, etc.
Fixed output – whether a customer is interested in a product, whether an email is spam, the type of wine, a restaurant rating from 1 to 5, etc.
Linear regression uses the OLS approach, while logistic regression uses the MLE approach
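The continuous-vs-fixed distinction above can be illustrated with scikit-learn, assuming it is available; the toy data below is invented purely for the sketch:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])

# Continuous target (e.g. a price): linear regression, fitted by OLS.
y_price = np.array([10.0, 20.0, 30.0, 40.0])
lin = LinearRegression().fit(X, y_price)
print(lin.predict([[5.0]]))  # a continuous value, here 50.0

# Fixed target (e.g. spam / not spam): logistic regression, fitted by MLE.
y_spam = np.array([0, 0, 1, 1])
log = LogisticRegression().fit(X, y_spam)
print(log.predict([[5.0]]))  # a class label, 0 or 1
```

The two estimators share the same linear form; only the output type (and the fitting approach, OLS vs. MLE) differs.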
MLE VS OLS
MLE
MLE – maximum likelihood estimation
Treats statistics such as the mean and variance as parameters, and finds the values that make the given data set most likely
Used by logistic regression
Assumes a joint probability mass function for the data
OLS
OLS – ordinary least squares
Used by linear regression
Works by keeping the sum of squared deviations (the least-squares error) to a minimum
Does not require a distributional assumption to compute the estimates
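The OLS idea above has a closed form (the normal equations), which can be sketched with NumPy; the data is invented for the illustration. MLE for logistic regression, by contrast, has no closed form and is fitted iteratively:

```python
import numpy as np

# OLS: choose beta minimizing the sum of squared deviations.
# Closed form (normal equations): beta = (X^T X)^{-1} X^T y
X = np.column_stack([np.ones(5), np.arange(5.0)])  # intercept + one feature
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])            # exactly y = 1 + 2x
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # recovers [1.0, 2.0]
```

Because the toy data lies exactly on a line, the least-squares error at the solution is zero; with noisy data, OLS would return the line minimizing that error instead.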

ADVANTAGES OF LOGISTIC REGRESSION
One of the simplest classification algorithms
Easy to implement
Does not require long training or high computational power
Can be used to find the relationship between the attributes of a data set
Works well with both scaled and non-scaled features
Very effective when the classes are linearly separable
Model training time is far less than that of many other algorithms
DISADVANTAGES OF LOGISTIC REGRESSION
Assumes linearity between the dependent and independent variables
Not as effective when the data is non-linear, which is common in the real world
Relies heavily on a proper presentation of the data
Does not perform well when the data set contains outliers, because of its sensitivity to them
Does not predict correct results for small data sets with high dimensionality
Can easily be outperformed by other algorithms
Performance is poor with highly correlated features and irrelevant features
DECISION TREE ALGORITHM
Looks like a flowchart-style tree
Splits the data in a recursive partitioning manner
A "white box" algorithm
Does not depend on any probability distribution assumption

WORKING OF THE DECISION TREE ALGORITHM
1. Select the best attribute using an ASM (attribute selection measure)
2. Make this attribute the root node
3. Repeat the above process recursively on each branch
The process continues until:
All the tuples have the same attribute value
No attribute remains for further splitting
No instances are available for further splitting
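The steps above can be reproduced with scikit-learn's DecisionTreeClassifier; this is a sketch assuming scikit-learn is installed, and the iris data set and depth limit are arbitrary choices for the illustration:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# At each node the classifier picks the best attribute via an
# attribute selection measure (the `criterion`), then splits recursively.
iris = load_iris()
tree = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# The "white box" property: the split logic can be printed and inspected.
print(export_text(tree, feature_names=list(iris.feature_names)))
```

Setting `criterion="entropy"` selects splits by information gain; `"gini"` (the default) would use the Gini index instead.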
RECURSIVE BINARY SPLITTING
Builds a candidate decision tree with each attribute considered as the root node
Selects the tree whose cost function is lowest
A "greedy algorithm" – always tries to keep the cost function low
The cost functions are the attribute selection measures
The three main ones are:
1. Information gain
2. Gain ratio
3. Gini index
INFORMATION GAIN
Based on the concept of entropy
Entropy – the impurity of a data set
Information gain measures the decrease in entropy
Information gain = entropy before splitting − weighted average entropy after splitting
The attribute with the highest information gain is split first
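The formula above can be sketched directly in Python; the function names `entropy` and `information_gain` are my own, and the labels are toy data:

```python
import numpy as np

def entropy(labels):
    """Impurity of a label set: -sum(p * log2(p)) over the class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(parent, children):
    """Entropy before the split minus the weighted average entropy after it."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

# A perfect split removes all impurity; a useless split removes none.
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
print(information_gain([0, 0, 1, 1], [[0, 1], [0, 1]]))  # 0.0
```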

GAIN RATIO
An extension of information gain
Used by the C4.5 algorithm in place of the ID3 algorithm
C4.5 is also known as J48, the implementation developed by the WEKA tool development team
The attribute with the highest gain ratio is split first
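C4.5's gain ratio normalizes information gain by the split information (the entropy of the split proportions themselves), penalizing attributes that fragment the data into many small branches. A minimal sketch, with function names of my own choosing:

```python
import numpy as np

def entropy(labels):
    """Impurity of a label set: -sum(p * log2(p))."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(parent, children):
    """Information gain divided by split information (C4.5)."""
    n = len(parent)
    weights = np.array([len(c) / n for c in children])
    gain = entropy(parent) - sum(w * entropy(c) for w, c in zip(weights, children))
    split_info = float(-(weights * np.log2(weights)).sum())
    return gain / split_info

# A clean two-way split of a balanced binary set: gain 1.0, split info 1.0.
print(gain_ratio([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
```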
GINI INDEX
The attribute with the minimum Gini index is split first
Considers only binary splits
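The Gini index of a node is 1 − Σ pᵢ², where pᵢ is the proportion of class i; it is 0 for a pure node. A minimal sketch (the function name is my own):

```python
import numpy as np

def gini(labels):
    """Gini index: 1 - sum(p_i^2); 0 for a pure node, higher for mixed ones."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

print(gini([1, 1, 1]))        # 0.0 (pure node)
print(gini([0, 0, 1, 1]))     # 0.5 (maximally mixed binary node)
```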
CONCLUSION
Logistic regression – one of the most preferred classification algorithms
Decision tree – the non-linear counterpart of logistic regression
Decision tree – the output can be visualized together with the logic of each split
Typical applications – e-mail spam detection, product rating, and wine type classification
The output of logistic regression is of a fixed (categorical) type
Decision trees split using information gain, gain ratio, and the Gini index

[object Object]