Quantitative Research Methods Assignment: Analysis of Marital Quality

Added on  2023/06/03

Homework Assignment
AI Summary
This document presents a comprehensive solution to an analytical exercise in quantitative research methods. The assignment focuses on analyzing the factors influencing marital quality among Korean women, using data from a 2007 survey. The analysis begins by defining the policy question, identifying the dependent and independent variables, and summarizing their distributions. It then explores the relationship between husband's income and marital quality through correlation, scatter plots, and regression analysis, including interpreting coefficients and testing hypotheses. Furthermore, the solution addresses omitted variable bias by considering the husband's education and work hours as control variables, assessing the impact of these variables on the estimated effect of income. The assignment utilizes Stata for statistical analysis, requiring students to interpret regression outputs, calculate residuals, and evaluate the significance of findings. The final analysis includes the development of regression models to determine the factors affecting marital quality.
Quantitative Research Methods
Student Name:
Instructor Name:
Course Number:
30 October 2018
1. (5 points) To begin with, understand the policy question.
a. (3 points) What is the policy question of this analysis? What is the dependent variable
and how is it measured? What is the key independent variable and how is it
measured?
Answer
What factors influence the marital quality of women? The dependent variable is
marital quality, measured on a 7-point scale ranging from 1 to 7. The key
independent variable is the husband’s income, measured in 1,000 USD per month.
b. (2 points) What is the population? What is the sample?
Answer
The population is all Korean women aged between 20 and 64, while the sample is
the 2,682 women from that population who were surveyed in 2007.
2. (4 points) Write a short paragraph which summarizes the distribution of the dependent
variable and the distribution of the key independent variable using summary statistics.
Answer
The average marital quality of the sampled women was 5.38 (SD = 1.15) on the 7-point
scale, with the highest and lowest recorded values being 7 and 1 respectively. The
average husband’s income was 2,748.5 US dollars per month (SD = 1,291.8 US dollars),
with the highest and lowest recorded incomes being 10,000 US dollars and 300 US
dollars per month.
. summarize maritalq hinc

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    maritalq |       2676    5.381913    1.150974          1          7
        hinc |       2604    2.748522    1.291823         .3         10
3. (6 points) Before conducting regression analysis, it is valuable to examine the
relationship between hinc and maritalq with simple graphical and numerical summary.
a. (1 point) Calculate the correlation between hinc and maritalq.
Answer
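. pwcorr maritalq hinc, obs sig

             | maritalq     hinc
-------------+------------------
    maritalq |   1.0000
             |
             |     2676
             |
        hinc |   0.1574   1.0000
             |   0.0000
             |     2599     2604

The correlation between hinc and maritalq is r = 0.1574 (p = 0.0000, based on 2,599
observations).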
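As a rough cross-check of the pwcorr output, the sketch below squares the reported correlation and computes the usual t statistic for testing that a correlation is zero. It uses only the two numbers printed in the table (r and the pairwise sample size); the variable names are illustrative and not part of the assignment.

```python
import math

# Values read from the pwcorr output (not recomputed from the raw data).
r = 0.1574   # correlation between maritalq and hinc
n = 2599     # observations with both variables present

# In a simple regression, the squared correlation equals R-squared.
r_squared = r ** 2

# t statistic for H0: correlation = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r^2 = {r_squared:.4f}")
print(f"t   = {t_stat:.2f}")
```

A squared correlation of about 0.025 means husband's income alone accounts for only a small share of the variation in marital quality, even though the association is statistically significant.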
b. (2 points) Draw a scatter plot between hinc and maritalq.
Answer
[Figure: scatter plot of marital quality (vertical axis, 0 to 8) versus husband’s income in 1,000 USD per month (horizontal axis, 0 to 10)]
c. (1 point) Based on what you find in Q3.a and Q3.b, how do you think the two
variables are related?
Answer
The above results show a weak, positive, non-linear relationship between
husband’s income and marital quality. The positive direction is visible in both
the correlation (r = 0.1574) and the scatter plot.
d. (2 point) Is there any evidence of a non-linear relationship in the scatter plot? If so,
describe the non-linearity.
Answer
Yes, there is evidence of a non-linear relationship. The association between
husband’s income and marital quality is positive, but the points do not follow a
straight line.
4. (13 points) Now let’s examine the relationship with a regression with no control variable.
Output
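. reg maritalq hinc

      Source |       SS       df       MS              Number of obs =    2599
-------------+------------------------------           F(  1,  2597) =   66.00
       Model |  85.1578406     1  85.1578406           Prob > F      =  0.0000
    Residual |  3350.74905  2597  1.29023837           R-squared     =  0.0248
-------------+------------------------------           Adj R-squared =  0.0244
       Total |  3435.90689  2598  1.32251997           Root MSE      =  1.1359

------------------------------------------------------------------------------
    maritalq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        hinc |   .1402914   .0172685     8.12   0.000       .10643    .1741527
       _cons |   5.007391   .0524165    95.53   0.000     4.904609    5.110174
------------------------------------------------------------------------------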
a. (4 points) Run a regression of maritalq on hinc with no control variable. Write the
estimated regression equation based on the Stata regression output. What is the
estimated slope of hinc? Interpret the estimated slope.
Answer
The estimated regression equation is:
MaritalQ = 5.0074 + 0.1403 (hinc)
The estimated slope is 0.1403: a one-unit increase in the husband’s income (1,000 USD
per month) is associated with an increase of 0.1403 points in predicted marital
quality. Equivalently, a one-unit decrease in the husband’s income is associated with
a decrease of 0.1403 points.
b. (5 points) State the null hypothesis and the alternative hypothesis which correspond to
the slope coefficient of hinc in the Stata regression table. Can you reject the null
hypothesis in favor of the alternative hypothesis at 5% significance level? Justify your
answer using three kinds of evidence, (1) confidence interval, (2) t statistics, and (3)
p-value.
Answer
The hypotheses are:
Null Hypothesis (H0): The slope coefficient of hinc is zero (β1 = 0)
Alternative Hypothesis (HA): The slope coefficient of hinc is not zero (β1 ≠ 0)
From the results we reject the null hypothesis at the 5% significance level.
Justification using confidence interval;
The 95% confidence interval runs from 0.1064 to 0.1742. Zero is not within the
interval, hence the null hypothesis is rejected at the 5% level of
significance.
Justification using t statistics
The computed t-value is 8.12, which in absolute value is clearly greater than the
critical t value from the tables (t = 1.96). The null hypothesis is therefore
rejected at the 5% level of significance.
Justification using p-value
The p-value is reported as 0.000, which is less than α = 0.05; we therefore reject
the null hypothesis at the 5% level of significance.
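The three kinds of evidence can be reproduced from the coefficient and standard error alone. The following is a minimal sketch, assuming the standard normal is an adequate stand-in for the t distribution with 2,597 degrees of freedom (so the interval differs from Stata's in the fourth decimal); the variable names are illustrative.

```python
from statistics import NormalDist

# Values read from the Stata output for `reg maritalq hinc`.
coef = 0.1402914
se = 0.0172685
alpha = 0.05

# Assumption: with 2,597 residual degrees of freedom the t distribution is
# very close to the standard normal, so normal quantiles are used here.
z = NormalDist()
crit = z.inv_cdf(1 - alpha / 2)   # roughly 1.96

t_stat = coef / se
ci_low = coef - crit * se
ci_high = coef + crit * se
p_value = 2 * (1 - z.cdf(abs(t_stat)))

# The three decision rules should agree with one another.
reject_by_ci = not (ci_low <= 0 <= ci_high)
reject_by_t = abs(t_stat) > crit
reject_by_p = p_value < alpha

print(f"t = {t_stat:.2f}, CI = ({ci_low:.4f}, {ci_high:.4f})")
print(reject_by_ci, reject_by_t, reject_by_p)
```

All three rules are algebraically equivalent for a two-sided test, which is why they lead to the same decision.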
c. (4 points) How much did the husband of the woman whose pid is 24 earn per month
in 2007? Predict her marital quality based on her husband’s income and the estimated
regression line in Q4.a. What was her actual marital quality in 2007? Calculate
residual in her marital quality. (*For predicted values and residuals, refer to the last
two pages of lecture notes in Week 8.)
Answer
The husband earned 2,500 USD per month.
The predicted marital quality is:
MaritalQ = 5.0074 + 0.1403 (2.5)
MaritalQ = 5.0074 + 0.35075 = 5.35815
The actual marital quality in 2007 was 5.
Residual = Actual value − Predicted value
Residual = 5 − 5.35815 = −0.35815
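The prediction and residual for this respondent can be checked with a few lines of arithmetic. This sketch uses the rounded coefficients from the Q4.a equation and the income and marital quality values stated in the answer; the function name is hypothetical.

```python
def predict_maritalq(hinc):
    """Predicted marital quality from the rounded Q4.a equation."""
    return 5.0074 + 0.1403 * hinc

hinc_pid24 = 2.5   # husband's income in 1,000 USD per month, as stated above
actual = 5         # reported marital quality, as stated above

predicted = predict_maritalq(hinc_pid24)
residual = actual - predicted   # actual minus predicted

print(f"predicted = {predicted:.5f}")
print(f"residual  = {residual:.5f}")
```

The negative residual indicates that this woman's reported marital quality lies below the value the regression line predicts for her husband's income.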
5. (15 points) Now, we consider the husband’s education in years as a control variable.
Output
a. (2 points) Explain theoretically or intuitively how the two conditions for omitted
variables might be satisfied for the husband’s years of education.
Answer
Condition 1: The control variable (husband’s education in years) is intuitively
thought to be correlated with the husband’s income. Husbands with higher education
levels (more years of education) tend to have higher income levels.
Condition 2: The control variable (husband’s education in years) affects marital
quality. More years of education might lead to higher marital quality.
b. (2 point) Under the scenarios assumed in Q5.a, predict whether omitting the years of
education might make the estimated effect of the husband’s income upward or
downward biased. Briefly explain why.
Answer
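Based on the above, omitting the years of education would make the estimated effect
of the husband’s income upward biased. Years of education is expected to be positively
related to marital quality and positively correlated with the husband’s income, so
when education is left out, the income coefficient also picks up part of the positive
effect of education.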
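The direction of the bias follows from the standard omitted variable bias formula, sketched below. The symbols are assumptions introduced here for illustration and do not appear in the assignment: β1 and β2 are the true coefficients on hinc and hedy, δ1 is the slope from regressing hedy on hinc, and the tilde marks the estimate from the regression that omits hedy.

```latex
\begin{aligned}
\text{Full model:}\quad & \mathit{maritalq} = \beta_0 + \beta_1\,\mathit{hinc} + \beta_2\,\mathit{hedy} + u \\
\text{Auxiliary regression:}\quad & \mathit{hedy} = \delta_0 + \delta_1\,\mathit{hinc} + v \\
\text{Short regression:}\quad & E\big[\tilde{\beta}_1\big] = \beta_1 + \beta_2\,\delta_1 \\
\text{Bias:}\quad & E\big[\tilde{\beta}_1\big] - \beta_1 = \beta_2\,\delta_1 > 0
  \quad \text{when } \beta_2 > 0 \text{ and } \delta_1 > 0
\end{aligned}
```

Condition 1 corresponds to δ1 ≠ 0 and Condition 2 to β2 ≠ 0; when both are positive, their product is positive and the short regression overstates the effect of income.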
c. (2 points) Now let’s run the regression with the additional control for the years of
education to the model in Q4.a. Run the regression and write the estimated regression
model.
Answer
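. reg maritalq hinc hedy

      Source |       SS       df       MS              Number of obs =    2585
-------------+------------------------------           F(  2,  2582) =   56.77
       Model |   143.99766     2    71.99883           Prob > F      =  0.0000
    Residual |    3274.677  2582   1.2682715           R-squared     =  0.0421
-------------+------------------------------           Adj R-squared =  0.0414
       Total |  3418.67466  2584  1.32301651           Root MSE      =  1.1262

------------------------------------------------------------------------------
    maritalq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        hinc |   .0827852   .0190244     4.35   0.000     .0454806    .1200897
        hedy |   .0623724   .0090603     6.88   0.000     .0446061    .0801387
       _cons |    4.29661   .1159383    37.06   0.000     4.069268    4.523951
------------------------------------------------------------------------------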
The estimated regression equation is:
MaritalQ = 4.2966 + 0.0828 (hinc) + 0.0624 (hedy)
d. (2 points) Interpret the estimated slope coefficient of hinc.
Answer
The estimated slope is 0.0828: holding the husband’s years of education constant, a
one-unit increase in the husband’s income (1,000 USD per month) is associated with an
increase of 0.0828 points in predicted marital quality. Equivalently, a one-unit
decrease in the husband’s income is associated with a decrease of 0.0828 points.
e. (2 points) Are the two conditions for omitted variable bias associated with
educational attainment satisfied empirically? Answer with empirical evidence.
Answer
. pwcorr maritalq hinc hedy, obs sig

             | maritalq     hinc     hedy
-------------+---------------------------
    maritalq |   1.0000
             |
             |     2676
             |
        hinc |   0.1574   1.0000
             |   0.0000
             |     2599     2604
             |
        hedy |   0.1856   0.4339   1.0000
             |   0.0000   0.0000
             |     2660     2590     2666
Yes, the two conditions are satisfied based on the empirical evidence. As can be seen
from the above table, there is a positive and statistically significant relationship
between husband’s education in years and husband’s income (r = 0.4339, p = 0.0000).
There is also a positive and statistically significant relationship between marital
quality and husband’s education in years (r = 0.1856, p = 0.0000).
f. (3 points) Based on your comparison of regressions in Q4.a and Q5.c, how does the
failure to control for husband’s education cause the estimated effect of the husband’s
income biased – upward, downward, or unbiased? Confirm the direction of the bias
using the empirical evidence from Q5.e.
Answer
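Bias = 0.1403 − 0.0828 = 0.0575
Bias = 0.0575 > 0
Since the coefficient on hinc falls from 0.1403 to 0.0828 once hedy is controlled
for, the bias is greater than zero, and we can conclude that omitting the years of
education makes the estimated effect of the husband’s income upward biased. This
agrees with the evidence in Q5.e: hedy is positively correlated with both hinc
(r = 0.4339) and maritalq (r = 0.1856).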
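The size of the bias can also be related to the coefficient on hedy. In a common sample, the income slope without hedy equals the income slope with hedy plus the hedy coefficient times the slope from regressing hedy on hinc. The sketch below backs out that implied slope from the reported coefficients. It is only approximate, because the two regressions use slightly different samples (2,599 versus 2,585 observations), and the variable names are illustrative.

```python
# Coefficients read from the two Stata regression outputs.
b_short = 0.1402914   # hinc coefficient with hedy omitted (Q4.a)
b_long = 0.0827852    # hinc coefficient with hedy included (Q5.c)
b_hedy = 0.0623724    # hedy coefficient (Q5.c)

# Correlation between hinc and hedy from the pwcorr output (Q5.e).
r_hinc_hedy = 0.4339

# Estimated bias from omitting hedy.
bias = b_short - b_long

# Implied slope of hedy on hinc, assuming the identity holds approximately
# despite the slightly different estimation samples.
implied_delta = bias / b_hedy

# Sign check: a positive hedy effect and a positive hinc-hedy association
# imply an upward bias.
upward = b_hedy > 0 and r_hinc_hedy > 0

print(f"bias = {bias:.4f}, implied slope of hedy on hinc = {implied_delta:.3f}")
print("upward bias expected:", upward)
```

An implied slope of roughly 0.9 would mean that each additional 1,000 USD of monthly income goes with a little under one extra year of education, which is a plausible magnitude but cannot be confirmed without the data.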
g. (2 points) Which model, either the model in Q4.a or the model in Q5.c, do you prefer
and why?
Answer
I prefer the model in Q5.c because its adjusted R-squared is higher. In model Q4.a,
the adjusted R-squared is 0.0244, while for the model in Q5.c it is 0.0414. This
shows that a slightly higher proportion of the variation in the dependent variable
is explained in model Q5.c than in model Q4.a, even after adjusting for the extra
regressor. Controlling for hedy also reduces the omitted variable bias in the hinc
coefficient.
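The adjusted R-squared values can be recomputed from the sums of squares and degrees of freedom printed in each ANOVA table. This is a small sketch of that calculation; the function name is hypothetical.

```python
def adjusted_r2(ss_residual, df_residual, ss_total, df_total):
    """Adjusted R-squared: 1 - (residual mean square / total mean square)."""
    return 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Residual and Total rows of the Stata ANOVA tables.
adj_q4a = adjusted_r2(3350.74905, 2597, 3435.90689, 2598)   # reg maritalq hinc
adj_q5c = adjusted_r2(3274.677, 2582, 3418.67466, 2584)     # reg maritalq hinc hedy

print(f"Q4.a adjusted R-squared = {adj_q4a:.4f}")
print(f"Q5.c adjusted R-squared = {adj_q5c:.4f}")
```

Because the adjustment penalizes each added regressor, a rise in adjusted R-squared suggests that hedy contributes more explanatory power than would be expected from adding an irrelevant variable.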
6. (10 points) Now, we consider the husband’s work hours as another control variable.
a. (2 points) Run the regression that additionally controls for variables related with the
husband’s work hours to the model you chose in Q5.g. Run the regression, and write
the estimated regression model.
Answer
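. reg maritalq hinc hedy hhour_4050 hhour_5060

      Source |       SS       df       MS              Number of obs =    2585
-------------+------------------------------           F(  4,  2580) =   33.02
       Model |  166.481046     4  41.6202616           Prob > F      =  0.0000
    Residual |  3252.19362  2580  1.26054016           R-squared     =  0.0487
-------------+------------------------------           Adj R-squared =  0.0472
       Total |  3418.67466  2584  1.32301651           Root MSE      =  1.1227

------------------------------------------------------------------------------
    maritalq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        hinc |    .074213   .0190765     3.89   0.000     .0368061    .1116199
        hedy |   .0588287   .0090725     6.48   0.000     .0410385    .0766188
  hhour_4050 |   .2303237   .0557868     4.13   0.000     .1209323     .339715
  hhour_5060 |   .1160036   .0719192     1.61   0.107    -.0250215    .2570287
       _cons |   4.209194   .1184698    35.53   0.000     3.976889      4.4415
------------------------------------------------------------------------------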
The estimated regression equation is:
MaritalQ = 4.2092 + 0.0742 (hinc) + 0.0588 (hedy) + 0.2303 (hhour_4050) + 0.1160 (hhour_5060)
b. (2 points) Interpret each and every slope coefficient of the work hours.
Answer
The estimated slope for hinc is 0.0742: holding the other variables constant, a
one-unit increase in the husband’s income (1,000 USD per month) is associated with an
increase of 0.0742 points in predicted marital quality. Equivalently, a one-unit
decrease in the husband’s income is associated with a decrease of 0.0742 points.
The estimated slope for hedy is 0.0588: holding the other variables constant, one
additional year of the husband’s education is associated with an increase of 0.0588
points in predicted marital quality. Equivalently, one year less of education is
associated with a decrease of 0.0588 points.
The estimated slope for hhour_4050 is 0.2303: holding income and education constant,
women whose husbands work between 40 and 50 hours per week report marital quality
approximately 0.2303 points higher than women whose husbands work 60 hours or more
per week.
The estimated slope for hhour_5060 is 0.1160: holding income and education constant,
women whose husbands work between 50 and 60 hours per week report marital quality
approximately 0.1160 points higher than women whose husbands work 60 hours or more
per week.
c. (4 points) Test a null hypothesis that the husband’s work hours do not matter to
marital quality against its alternative hypothesis. Using “β”s related to the husband’s
work hours, re-write the null hypothesis and write the alternative hypothesis. Can you
reject the null hypothesis at 1% significance level? Provide your evidence.
Answer
The null hypothesis that the husband’s work hours do not matter to marital quality
against its alternative hypothesis is rejected at 1% level of significance.
Using “β”s related to the husband’s work hours
H0 : β31=0
HA: β31 ≠ 0
Here β31 is the coefficient on the dummy for husbands who work between 40 and 50
hours per week. Its p-value is 0.000 (less than the 1% level of significance), so we
reject the null hypothesis for this coefficient and conclude that β31 is
significantly different from zero at the 1% level of significance.
H0: β32 = 0
HA: β32 ≠ 0
Here β32 is the coefficient on the dummy for husbands who work between 50 and 60
hours per week. Its p-value is 0.107 (greater than the 1% level of significance), so
we fail to reject the null hypothesis for this coefficient and conclude that β32 is
not significantly different from zero at the 1% level of significance.
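The individual t tests above examine one coefficient at a time. The statement that work hours do not matter at all corresponds to the joint null hypothesis β31 = β32 = 0, which is usually tested with an F test. The sketch below computes that F statistic from the residual sums of squares of the model without the work-hours dummies (Q5.c) and the model with them (Q6.a), both estimated on 2,585 observations. It relies on the closed-form tail probability of an F distribution with 2 numerator degrees of freedom; the variable names are illustrative. In Stata, the same test would typically be requested with `test hhour_4050 hhour_5060` after the regression.

```python
# Residual sums of squares read from the two Stata ANOVA tables.
ssr_restricted = 3274.677       # reg maritalq hinc hedy
ssr_unrestricted = 3252.19362   # reg maritalq hinc hedy hhour_4050 hhour_5060

q = 2            # number of restrictions (two work-hours dummies)
df_resid = 2580  # residual degrees of freedom in the unrestricted model

# F statistic for the joint null hypothesis that both dummy coefficients are zero.
f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_resid)

# For exactly 2 numerator degrees of freedom, the upper-tail probability of
# the F distribution is (1 + 2F/d)^(-d/2), where d is the denominator df.
p_value = (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)

print(f"F({q}, {df_resid}) = {f_stat:.2f}, p = {p_value:.5f}")
print("reject at 1%:", p_value < 0.01)
```

Under this calculation the joint null is rejected at the 1% level, which is consistent with the overall conclusion stated in the answer even though one of the two dummies is not individually significant.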
d. (2 points) What is the adjusted R squared of the regression in Q6.a? Interpret the
adjusted R squared.
Answer
The adjusted R squared of the regression in Q6.a is 0.0472; this implies that only
about 4.72% of the variation in the dependent variable (marital quality) is explained
by the model (husband’s income, education, and the two work-hours dummies), after
adjusting for the number of regressors.
7. (4 points) In Q3.d, we found some evidence of non-linearity between hinc and maritalq.
To relieve the concern about wrong functional forms, let’s examine the possibility of the
non-linear relationship for the regression in Q6.a.
a. (2 points) To do so, create logged income of the husband by taking log of hinc. Then
regress maritalq on the logged hinc and the control variables included in Q6.a.
Interpret the estimated slope coefficient for the husband’s income. (*This question is
related to our learning in Week 11. For preview, you can create logged variable1,
named as logvariable1 (or use whatever name you like), as below:
gen logvariable1 = ln(variable1)
//ln denotes natural log.
//“l” in ln is the lower case of “L” (NOT number one).
Answer
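. reg maritalq loghinc hedy hhour_4050 hhour_5060

      Source |       SS       df       MS              Number of obs =    2585
-------------+------------------------------           F(  4,  2580) =   34.61
       Model |  174.087863     4  43.5219659           Prob > F      =  0.0000
    Residual |   3244.5868  2580  1.25759178           R-squared     =  0.0509
-------------+------------------------------           Adj R-squared =  0.0495
       Total |  3418.67466  2584  1.32301651           Root MSE      =  1.1214

------------------------------------------------------------------------------
    maritalq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     loghinc |   .2486177   .0539729     4.61   0.000     .1427832    .3544522
        hedy |   .0538514   .0092799     5.80   0.000     .0356547    .0720481
  hhour_4050 |   .2117787   .0561387     3.77   0.000     .1016972    .3218602
  hhour_5060 |   .1032547   .0719845     1.43   0.152    -.0378985     .244408
       _cons |   4.270958    .119513    35.74   0.000     4.036607    4.505309
------------------------------------------------------------------------------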
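The estimated slope for loghinc is 0.2486. Because income enters in logs, a 1%
increase in the husband’s income is associated with an increase of about
0.2486/100 = 0.002486 points in predicted marital quality, holding the husband’s
education and work hours constant.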
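The division by 100 is an approximation that works well for small percentage changes. The sketch below compares it with the exact change implied by the logged specification, using the loghinc coefficient from the output; the function name is hypothetical.

```python
import math

b_log = 0.2486177   # coefficient on loghinc from the Stata output

def change_in_maritalq(pct_increase):
    """Exact change in predicted marital quality for a given % rise in hinc."""
    return b_log * math.log(1 + pct_increase / 100)

approx_1pct = b_log / 100               # rule-of-thumb interpretation
exact_1pct = change_in_maritalq(1)      # exact change for a 1% increase
exact_10pct = change_in_maritalq(10)    # exact change for a 10% increase

print(f"1%  increase: approx {approx_1pct:.6f}, exact {exact_1pct:.6f}")
print(f"10% increase: exact {exact_10pct:.4f}")
```

For larger changes the exact formula gives a somewhat smaller effect than simple scaling, which reflects the diminishing returns built into the logarithmic form.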
b. (2 points) Note the model in Q6.a and the model in Q7.a have no difference in terms
of variables used in the regressions. Instead, the models are different only in terms of
the functional form of hinc. Which model do you prefer, and why? Justify your
answer with evidence.
Answer
I prefer the model in Q7.a. This is based on the fact that an increase in the value of
adjusted R-Squared is observed. In model Q6.a, the value of adjusted R-Squared was