Running head: LOGISTIC REGRESSION AND APPLICATION ON STATA
Logistic Regression and Application on STATA
Name of the Student:
Name of the University:
Author Note:
Table of Contents
Answer 1.a
Answer 1.b
Answer 2.a
Answer 2.b
Answer 2.c
Answer 3.a
Answer 3.b
Answer 3.c.I
Answer 3.c.II
Answer 3.c.III
Answer 3.d
Answer 3.e
Reference
Answer 1.a.
Age     Observed proportion    Proportion free   Estimated   Estimated    Estimated
group   of CHD diagnoses, π̂    of CHD, 1 − π̂     Odds        Odds Ratio   Relative Risk
35-40   0.057                  0.943             0.060       1.00*        1.00*
41-45   0.05                   0.95              0.053       0.871        0.877
46-50   0.093                  0.907             0.103       1.696        1.632
51-55   0.123                  0.877             0.140       2.320        2.158
56-60   0.149                  0.851             0.175       2.897        2.614
Table 1: Age Group Wise CHD Diagnoses
Answer 1.b.
Odds express the chance of an event occurring relative to the chance of it not occurring.
For the age group 51-55, the estimated odds are 0.140. This means that, among male subjects
of this age group, there are about 0.14 CHD diagnoses for every subject free of CHD (roughly
14 diagnosed per 100 not diagnosed). The corresponding probability of diagnosis is 0.123,
not 14%.
The odds ratio is a statistical measure of the relative odds of an event's occurrence
under two different conditions. For the age group 51-55, the estimated odds ratio relative
to the age group 35-40 is 2.320. It means the odds of male subjects of age group 51-55
getting diagnosed with CHD are 2.32 times the odds for male subjects of age group 35-40.
Relative risk is the ratio of the probabilities of an event's occurrence in two groups.
The relative risk of the age group 51-55 is 2.158. The risk of male subjects of age group
51-55 getting diagnosed with CHD is 2.158 times the risk for male subjects of age group
35-40 (Norton, Miller & Kleinman, 2013).
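The three measures can be reproduced from the observed proportions alone. The following is a minimal Python sketch, not part of the original Stata work; the names `proportions`, `odds` and `summarise` are illustrative assumptions, and the proportions are copied from Table 1 with 35-40 as the reference group.

```python
# Observed proportions of CHD diagnoses by age group, copied from Table 1.
proportions = {
    "35-40": 0.057,
    "41-45": 0.05,
    "46-50": 0.093,
    "51-55": 0.123,
    "56-60": 0.149,
}
REFERENCE_GROUP = "35-40"


def odds(p):
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1.0 - p)


def summarise(props, reference):
    """Return {group: (odds, odds ratio, relative risk)} against the reference group."""
    ref_p = props[reference]
    ref_odds = odds(ref_p)
    return {
        group: (odds(p), odds(p) / ref_odds, p / ref_p)
        for group, p in props.items()
    }


table = summarise(proportions, REFERENCE_GROUP)
for group, (group_odds, odds_ratio, rel_risk) in table.items():
    print(f"{group}: odds={group_odds:.3f}  OR={odds_ratio:.3f}  RR={rel_risk:.3f}")
```

The odds ratio and the relative risk stay close here because CHD is fairly rare in every age group; they drift apart as the proportions grow.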
Answer 2.a
[Figure: Mortal Status at 30 Days (y-axis, 0 to 1) against APACHE II Score at Baseline (x-axis, 0 to 40)]
Figure 1: Scatter Diagram of Mortal Status against APACHE Score
[Figure: Mortal Status at 30 Days (y-axis, 0 to 1) against Group Apache (x-axis, 1 to 3)]
Figure 2: Scatter Diagram of Mortal Status against Group APACHE Score
From Figure 1 and Figure 2, it is observed that, in this sample of sepsis patients, those
with an APACHE score below 17 were alive at 30 days post admission to the intensive care
unit, while those with a score above 27 were dead at 30 days post admission.
Answer 2.b
_cons .0129352 .0177422 -3.17 0.002 .0008795 .1902389
apache 1.222914 .0744759 3.30 0.001 1.085319 1.377953
fate Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
Log likelihood = -14.956085 Pseudo R2 = 0.4276
Prob > chi2 = 0.0000
LR chi2(1) = 22.35
Logistic regression Number of obs = 38
. logistic fate apache
T
able.2.b: Logistic Regression effect of APACHE Score on FATE
From the table above, the odds ratio corresponding to APACHE is statistically significant
(p = 0.001). This indicates the presence of an effect of APACHE on fate. A one-unit
increase in the APACHE score multiplies the odds of death within 30 days by 1.223, an
increase of about 22.3%. Moreover, the exponentiated constant term is 0.0129 with
p = 0.002 (< 0.05); it is the estimated odds of death when the APACHE score is 0, not an
odds ratio.
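The z statistic and confidence interval in Table 2.b can be checked by hand from the reported odds ratio and its standard error. The sketch below is a Python illustration rather than the original Stata session; it relies on the delta-method relation se(OR) = OR × se(b) that Stata uses for exponentiated coefficients, and the variable names are assumptions.

```python
import math

# Values copied from the apache row of Table 2.b.
odds_ratio = 1.222914
se_odds_ratio = 0.0744759

# The standard error of an odds ratio is OR * se(b) (delta method),
# so the log-scale standard error is recovered by division.
b = math.log(odds_ratio)
se_b = se_odds_ratio / odds_ratio

# Wald z statistic and 95% confidence interval, built on the log-odds scale
# and then exponentiated back to the odds-ratio scale.
Z_CRIT = 1.959964  # two-sided 95% critical value of the standard normal
z = b / se_b
ci_low = math.exp(b - Z_CRIT * se_b)
ci_high = math.exp(b + Z_CRIT * se_b)

# Percentage change in the odds per one-point increase in APACHE.
percent_change = (odds_ratio - 1.0) * 100.0

print(f"b={b:.4f}  se(b)={se_b:.4f}  z={z:.2f}")
print(f"95% CI for OR: [{ci_low:.6f}, {ci_high:.6f}]")
print(f"odds change per APACHE point: {percent_change:.1f}%")
```

Building the interval on the log scale is why it is not symmetric around 1.222914.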
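Answer 2.c
The odds ratio and the relative risk are different measures: the odds ratio is the ratio
of two odds, whereas the relative risk is the ratio of two probabilities. Using the
estimates in Table 2.b, the estimated odds of dying within 30 days of admission for a
patient with an APACHE score of 17 are 0.0129352 × 1.222914^17 ≈ 0.396, so the estimated
risk is 0.396 / (1 + 0.396) ≈ 0.284.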
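As a hedged check on how a risk follows from the fitted model, the sketch below plugs the exponentiated estimates of Table 2.b into the logistic formula. It is written in Python for illustration; `predicted_odds` and `predicted_risk` are hypothetical helper names, not Stata commands.

```python
# Exponentiated estimates copied from Table 2.b.
BASELINE_ODDS = 0.0129352  # _cons: estimated odds when apache = 0
ODDS_RATIO = 1.222914      # multiplicative change in the odds per APACHE point


def predicted_odds(apache):
    """Odds of the outcome at a given APACHE score."""
    return BASELINE_ODDS * ODDS_RATIO ** apache


def predicted_risk(apache):
    """Probability of the outcome: odds / (1 + odds)."""
    o = predicted_odds(apache)
    return o / (1.0 + o)


risk_17 = predicted_risk(17)
risk_18 = predicted_risk(18)

# Relative risk for a one-point increase from 17 to 18, to compare with the odds ratio.
relative_risk_18_vs_17 = risk_18 / risk_17

print(f"odds at 17: {predicted_odds(17):.3f}")
print(f"risk at 17: {risk_17:.3f}")
print(f"RR (18 vs 17): {relative_risk_18_vs_17:.3f}  vs  OR: {ODDS_RATIO:.3f}")
```

A predicted risk is always between 0 and 1, and the relative risk for a one-point increase is smaller than the odds ratio because the outcome is not rare at this score.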
Answer 3.a
           | at least 7 hours of sleep
    Gender |        0          1 |     Total
-----------+---------------------+----------
         1 |      650      1,562 |     2,212
         2 |    1,168      1,620 |     2,788
-----------+---------------------+----------
     Total |    1,818      3,182 |     5,000
Table 3.a.1: 2x2 Table of Gender and Response
The above table presents a 2x2 table of gender and response.
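                 |   Exposed   Unexposed |      Total   Proportion Exposed
-----------------+-----------------------+--------------------------------
           Cases |       650        1562 |       2212       0.2939
        Controls |      1168        1620 |       2788       0.4189
-----------------+-----------------------+--------------------------------
           Total |      1818        3182 |       5000       0.3636

                 |  Point estimate   [95% Conf. Interval]
-----------------+---------------------------------------------
      Odds ratio |      .577171      .5117712    .6509115 (exact)
 Prev. frac. ex. |      .422829      .3490885    .4882288 (exact)
 Prev. frac. pop |     .1771393
                   chi2(1) = 83.40   Pr>chi2 = 0.0000
Table 3.a.2: Crude Odds Ratio of Getting 7 Hours of Sleep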
Table 3.a.2 shows the calculation of the crude odds ratio of getting 7 hours of sleep
comparing males to females. The odds ratio is 0.577, which means the odds of getting at
least 7 hours of sleep for males are 0.577 times the odds for females, that is, about 42%
lower.
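The crude odds ratio can be recomputed directly from the four cell counts of Table 3.a.1. This is a small Python sketch under the assumption that the column labelled 1 means "at least 7 hours of sleep"; the dictionary layout and the helper `sleep_odds` are illustrative only.

```python
# Cell counts copied from Table 3.a.1.
# Outer key: gender code; inner key: sleep indicator (1 = at least 7 hours).
counts = {
    1: {0: 650, 1: 1562},
    2: {0: 1168, 1: 1620},
}


def sleep_odds(gender):
    """Odds of getting at least 7 hours of sleep within one gender group."""
    return counts[gender][1] / counts[gender][0]


# Odds ratio comparing gender code 2 with gender code 1.
crude_or = sleep_odds(2) / sleep_odds(1)

# Equivalent cross-product form of the same ratio.
cross_product_or = (counts[2][1] * counts[1][0]) / (counts[2][0] * counts[1][1])

print(f"odds (gender 1): {sleep_odds(1):.4f}")
print(f"odds (gender 2): {sleep_odds(2):.4f}")
print(f"crude odds ratio: {crude_or:.6f}")
```

The result agrees with the point estimate in Table 3.a.2 to the reported precision.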
Answer 3.b
Logistic regression                             Number of obs   =       5000
                                                LR chi2(1)      =      84.21
                                                Prob > chi2     =     0.0000
Log likelihood = -3235.2009                     Pseudo R2       =     0.0128

atleast7hoursofsleep | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
              gender |    .577171     .03488    -9.09   0.000     .5127009     .649748
               _cons |   4.163544   .4202523    14.13   0.000     3.416222    5.074348
Table 3.b: Output Table of Logistic Regression Showing Odds Ratio with 95% Confidence
Interval
Answer 3.c.I
The following linear regression model is not appropriate because the dependent variable is
binary:
$$Y = \beta_0 + \beta_1 X + u$$
where X represents gender and Y is the binary variable which represents whether the
respondent is getting 7 hours of sleep or not.
Because Y is binary, the expectation of Y given X can be interpreted as a conditional
probability, that is, $E(Y \mid X) = \Pr(Y = 1 \mid X)$. Under the linear model, assuming
$E(u \mid X) = 0$,
$$E(Y \mid X) = \beta_0 + \beta_1 X = p,$$
which is not restricted to lie between 0 and 1. The logistic regression function is used
instead. The probability of getting 7 hours of sleep is
$$p = \Pr(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}} = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}},$$
and the probability of not getting 7 hours of sleep is
$$1 - p = \frac{1}{1 + e^{\beta_0 + \beta_1 X}}.$$
So the below equation can be formed:
$$\frac{p}{1 - p} = \frac{e^{\beta_0 + \beta_1 X}\left(1 + e^{\beta_0 + \beta_1 X}\right)}{1 + e^{\beta_0 + \beta_1 X}}$$
$$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 X}$$
$\frac{p}{1 - p}$ is simply the odds in favour of getting 7 hours of sleep.
$\beta_0$ and $\beta_1$ are the parameters of the model. Here the predictor is gender, and
the corresponding coefficient is $\beta_1$, along with a constant term $\beta_0$.
Answer 3.c.II
a) Taking the natural log of the odds equation, it can be presented as below:
$$\ln\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X$$
The above equation is the equation for the log odds of “getting 7 hours of sleep”.
b) The equation for the odds of “getting 7 hours of sleep” is:
$$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 X}$$
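To see the log-odds, odds and probability forms working together, the sketch below evaluates them with the estimates from Table 3.b. It is an illustrative Python example, assuming gender is coded 1 and 2 as in Table 3.a.1; the function names are hypothetical.

```python
import math

# Exponentiated estimates copied from Table 3.b.
EXP_B0 = 4.163544  # _cons
EXP_B1 = 0.577171  # gender

# Coefficients on the log-odds scale.
b0 = math.log(EXP_B0)
b1 = math.log(EXP_B1)


def log_odds(x):
    """ln(p / (1 - p)) = b0 + b1 * x"""
    return b0 + b1 * x


def odds(x):
    """p / (1 - p) = exp(b0 + b1 * x)"""
    return math.exp(log_odds(x))


def probability(x):
    """p = exp(b0 + b1 * x) / (1 + exp(b0 + b1 * x))"""
    o = odds(x)
    return o / (1.0 + o)


for gender in (1, 2):
    print(
        f"gender={gender}: log odds={log_odds(gender):.4f}  "
        f"odds={odds(gender):.4f}  p={probability(gender):.4f}"
    )
```

With a single binary predictor the fitted probabilities match the observed proportions in Table 3.a.1 (1562/2212 and 1620/2788).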
Answer 3.c.III
Logistic regression                             Number of obs   =       5000
                                                LR chi2(2)      =     765.94
                                                Prob > chi2     =     0.0000
Log likelihood = -2894.3388                     Pseudo R2       =     0.1169

atleast7hoursofsleep | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
              gender |   .9639816   .0656835    -0.54   0.590     .8434703    1.101711
             faculty |   1.850023   .0464527    24.50   0.000     1.761182    1.943347
               _cons |   .3191769   .0474125    -7.69   0.000     .2385551    .4270455
Table 3.c.III: Logistic Regression Including the Faculty
From the above table, it is observed that the variable gender, with p = 0.590, is not
statistically significant. The other variable, faculty, and the intercept term are
statistically significant, as the p-value for both is less than 0.05. The odds ratio for
faculty is 1.85, which means a one-unit increase in the value of faculty multiplies the
odds of getting 7 hours of sleep by 1.85, an increase of about 85%, holding gender
constant.
In comparison to Answer 3.b, the variable gender is no longer significant: its odds ratio
moves from 0.577 to 0.964, so once faculty is held constant there is no statistically
significant association between gender and the odds of “getting 7 hours of sleep”. The
significant gender effect in Answer 3.b was therefore largely due to the group variable
faculty, which was omitted from that model and whose effect was absorbed by the gender
coefficient and the intercept term (Liu, Li & Liang, 2014).
Answer 3.d
Faculty   N      Observed k   Observed p
1         1182   437          0.369
2         578    221          0.382
3         984    638          0.648
4         1544   1371         0.888
5         712    515          0.723
Table 3.d: Group Wise Probability
In faculty groups 1 and 2, the probability of getting 7 hours of sleep is comparatively
low, at 0.369 and 0.382 respectively. In group 3, the probability of getting 7 hours of
sleep is 0.648 (64.8%). For groups 4 and 5 it is 0.888 and 0.723 respectively. In general,
the higher-numbered faculty groups have a greater probability of getting 7 hours of sleep,
although group 5 is lower than group 4.
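The observed probabilities are simply k divided by N within each faculty group. A minimal Python sketch of that arithmetic follows, with the counts copied from Table 3.d; the name `faculty_counts` is an assumption made for illustration.

```python
# (N, k) pairs copied from Table 3.d:
# N = number of respondents, k = number reporting at least 7 hours of sleep.
faculty_counts = {
    1: (1182, 437),
    2: (578, 221),
    3: (984, 638),
    4: (1544, 1371),
    5: (712, 515),
}

# Observed probability of getting 7 hours of sleep in each faculty group.
observed_p = {faculty: k / n for faculty, (n, k) in faculty_counts.items()}

# Faculty groups with the highest and lowest observed probability.
highest = max(observed_p, key=observed_p.get)
lowest = min(observed_p, key=observed_p.get)

for faculty, p in observed_p.items():
    print(f"faculty {faculty}: p = {p:.4f}")
print(f"highest: faculty {highest}, lowest: faculty {lowest}")
```

The k column sums to 3,182, the same total of respondents with at least 7 hours of sleep shown in Table 3.a.1.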
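Answer 3.e
           |                        Faculty
    Gender |       1        2        3        4        5 |    Total
-----------+----------------------------------------------+---------
         1 |     305       31      619      768      489 |    2,212
         2 |     877      547      365      776      223 |    2,788
-----------+----------------------------------------------+---------
     Total |   1,182      578      984    1,544      712 |    5,000
Table 3.e: Gender Distribution across Faculty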
From the above table, it can be said that faculty group 2 has the fewest females (31) and
group 4 has the most (768), whereas males are most numerous in group 1 (877) and least
numerous in group 5 (223).
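Setting the gender mix of each faculty next to its observed sleep probability helps show why the gender effect weakens once faculty enters the model. The Python sketch below is an illustration only, using counts from Table 3.e and probabilities from Table 3.d; the variable names are assumptions, and it refers to the gender codes rather than to labels.

```python
# Counts by faculty copied from Table 3.e.
gender1 = {1: 305, 2: 31, 3: 619, 4: 768, 5: 489}
gender2 = {1: 877, 2: 547, 3: 365, 4: 776, 5: 223}

# Observed probability of at least 7 hours of sleep, copied from Table 3.d.
observed_p = {1: 0.369, 2: 0.382, 3: 0.648, 4: 0.888, 5: 0.723}

# Share of each faculty made up of respondents coded gender = 2.
share_gender2 = {f: gender2[f] / (gender1[f] + gender2[f]) for f in gender1}

# Two faculties with the largest gender = 2 share, and two with the lowest sleep probability.
top_share = sorted(share_gender2, key=share_gender2.get, reverse=True)[:2]
lowest_sleep = sorted(observed_p, key=observed_p.get)[:2]

for f in sorted(share_gender2):
    print(f"faculty {f}: share gender=2 = {share_gender2[f]:.3f}, sleep p = {observed_p[f]:.3f}")
print(f"largest gender=2 share: {sorted(top_share)}")
print(f"lowest sleep probability: {sorted(lowest_sleep)}")
```

The two faculties with the largest share of gender = 2 are also the two with the lowest sleep probability, which is the pattern expected when faculty confounds the crude gender comparison.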
Reference:
Norton, E. C., Miller, M. M., & Kleinman, L. C. (2013). Computing adjusted risk ratios and
risk differences in Stata. The Stata Journal, 13(3), 492-509.
Liu, D., Li, T., & Liang, D. (2014). Incorporating logistic regression to decision-theoretic
rough sets for classifications. International Journal of Approximate Reasoning, 55(1),
197-210.
Alsharif, A. A., & Pradhan, B. (2014). Urban sprawl analysis of Tripoli Metropolitan city
(Libya) using remote sensing data and multivariate logistic regression model. Journal
of the Indian Society of Remote Sensing, 42(1), 149-163.
Wooldridge, J. M. (2015). Introductory econometrics: A modern approach. Nelson
Education.