
Statistical Methods for Business Research - Desklib

Analyzing the National Longitudinal Survey of Youth 1979 (NLSY79) dataset using Stata coding.


Added on  2023-06-14

About This Document

This coursework covers questions on descriptive and inferential statistics, multiple regressions, dummy variables and logistic regressions. It includes STATA codes and statistical analysis of variables such as age, male, earnings, years of schooling, and usual number of hours worked per week in 2002.

Statistical Methods for Business Research
Department of Management, Birkbeck
Coursework Spring 2018
Student Name:
ID:
Date: 18th March 2018
Questions Descriptive and Inferential Statistics
Question 1
In STATA; codes provided in the appendix
Question 2
In STATA; codes provided in the appendix
Question 3
In STATA; codes provided in the appendix
Question 4
. summarize AGE MALE EARNINGS S HOURS

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         AGE |        540    40.83333     2.18402         37         45
        MALE |        540          .5    .5004636          0          1
    EARNINGS |        540    19.05415    14.18551       2.25     134.61
           S |        540    13.52778     2.40384          6         20
       HOURS |        540    40.53519    9.114845         10         60
Comments:
The table above gives summary statistics for age, sex (MALE), earnings, years of schooling and
usual number of hours worked per week in 2002. The average age was 40.83 years, with the oldest
participant aged 45 and the youngest aged 37. The standard deviation of 2.18 shows that ages are
tightly clustered around the mean. The mean of MALE was 0.5, indicating that equal numbers of
males and females were included in the study. Earnings averaged 19.05, with a maximum of 134.61
and a minimum of 2.25. The standard deviation of earnings (14.19) is large relative to the mean,
which indicates a widely dispersed distribution. Years of schooling averaged 13.53, ranging from
6 to 20 years. The mean usual number of hours worked per week in 2002 was 40.54, ranging from 10
to 60 hours a week.
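As a rough cross-check of the statements about spread, the coefficient of variation (standard deviation divided by the mean) puts the variables on a common, unitless scale. The sketch below is written in Python rather than Stata and only reuses the means and standard deviations reported in the table; the dictionary and variable names are illustrative.

```python
# Means and standard deviations copied from the `summarize` output above:
# variable -> (mean, standard deviation)
summary = {
    "AGE": (40.83333, 2.18402),
    "EARNINGS": (19.05415, 14.18551),
    "S": (13.52778, 2.40384),
    "HOURS": (40.53519, 9.114845),
}

# Coefficient of variation = standard deviation / mean (unitless).
cv = {name: sd / mean for name, (mean, sd) in summary.items()}

for name, value in cv.items():
    print(f"{name:<9} CV = {value:.3f}")

# MALE is a 0/1 dummy, so its mean is the share of males in the sample.
n_obs = 540
n_males = round(0.5 * n_obs)
print("males:", n_males, "females:", n_obs - n_males)
```

On this measure earnings are by far the most dispersed variable and age the least, which is consistent with the comments.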
Question 5:
. centile (EXP AGE S), centile (25 50 75)

                                                       -- Binom. Interp. --
    Variable |       Obs  Percentile    Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
         EXP |       540         25    14.85096        14.34615    15.42308
             |                   50    18.01923         17.3931    18.55769
             |                   75    20.40385        20.03846    20.74951
         AGE |       540         25          39              39          39
             |                   50          41              40          41
             |                   75       42.75              42          43
           S |       540         25          12              12          12
             |                   50        12.5              12          13
             |                   75          16              15          16
Comments:
The table above gives the 25th, 50th and 75th percentiles for three variables: total out-of-school
work experience in years as of the 2002 interview (EXP), age (AGE) and years of schooling, measured
as the highest grade completed as of 2002 (S). For work experience, the 25th percentile was 14.85
years, the median (50th percentile) was 18.02 years and the 75th percentile was 20.40 years. For
the respondents' age, the 25th percentile was 39 years, the median was 41 years and the 75th
percentile was 42.75 years. For years of schooling, the 25th percentile was 12 years, the median
was 12.5 years and the 75th percentile was 16 years.
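Fractional values such as 42.75 for age and 12.5 for schooling arise from interpolation, although both variables take whole-number values. The sketch below assumes the rule that, to my understanding, Stata's centile uses by default: linear interpolation between adjacent order statistics at position (n + 1)q/100. It is written in Python, and the function name and toy data are hypothetical.

```python
import math

def centile(values, q):
    """q-th centile by linear interpolation at position (n + 1) * q / 100."""
    xs = sorted(values)
    n = len(xs)
    pos = (n + 1) * q / 100
    r = math.floor(pos)      # 1-based index of the lower order statistic
    frac = pos - r           # weight given to the next order statistic
    if r < 1:
        return xs[0]
    if r >= n:
        return xs[-1]
    return xs[r - 1] + frac * (xs[r] - xs[r - 1])

# Hypothetical toy data, not taken from the NLSY79 sample.
toy = [10, 20, 30, 40]
print(centile(toy, 50))

# Interpolation positions for the coursework sample size (n = 540).
pos_75 = (540 + 1) * 75 / 100   # 405.75: between the 405th and 406th values
pos_50 = (540 + 1) * 50 / 100   # 270.5: halfway between the 270th and 271st
print(pos_75, pos_50)
```

If the 405th and 406th ordered ages were 42 and 43, the rule would give 42 + 0.75 × (43 − 42) = 42.75, which agrees with the reported 75th percentile; the underlying order statistics are an assumption here because the raw data are not shown.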
Question 6:
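. ttest EARNINGS, by(POV78)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     443    19.27578    .6690373    14.08161    17.96089    20.59067
       1 |      66    17.84409    1.700428    13.81434     14.4481    21.24008
---------+--------------------------------------------------------------------
combined |     509    19.09014    .6224029    14.04205    17.86734    20.31294
---------+--------------------------------------------------------------------
    diff |            1.431688     1.85348                -2.20976    5.073136
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   0.7724
Ho: diff = 0                                     degrees of freedom =      507

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.7799         Pr(|T| > |t|) = 0.4402          Pr(T > t) = 0.2201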
Comments:
A two-sample t-test was performed to test whether mean earnings differ between individuals who
were living in poverty in 1978 and those who were not. We assumed equal variances between the
groups, and the results are given in the table above. The two-tailed p-value is 0.4402
(t = 0.7724, 507 degrees of freedom), which is greater than the 5% level of significance. We
therefore fail to reject the null hypothesis and conclude that there is no significant difference
in mean earnings between individuals who were living in poverty in 1978 and those who were not.
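The pooled-variance calculation behind this test can be reproduced from the group sizes, means and standard deviations in the table. The Python sketch below is only an arithmetic cross-check. The standard library has no t distribution, so the p-value is approximated with the standard normal, which is an assumption that is reasonable only because there are 507 degrees of freedom.

```python
import math
from statistics import NormalDist

# Group summaries copied from the ttest output (POV78 = 0 and POV78 = 1).
n0, mean0, sd0 = 443, 19.27578, 14.08161
n1, mean1, sd1 = 66, 17.84409, 13.81434

df = n0 + n1 - 2

# Pooled variance: the equal-variances assumption weights each group's
# variance by its own degrees of freedom.
pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / df

# Standard error of the difference in means, then the t statistic.
se_diff = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
t_stat = (mean0 - mean1) / se_diff

# Two-sided p-value using a normal approximation to the t distribution.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"df = {df}, SE = {se_diff:.5f}, t = {t_stat:.4f}, p ~ {p_approx:.3f}")
```

The standard error and t statistic agree with the Stata output, and the approximate p-value is close to the reported 0.4402.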
Question 7:
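. tabulate EDUCATION DIVORCED, chi2

           |       DIVORCED
 EDUCATION |         0          1 |     Total
-----------+----------------------+----------
         0 |       190         80 |       270
         1 |       221         49 |       270
-----------+----------------------+----------
     Total |       411        129 |       540

          Pearson chi2(1) =   9.7878   Pr = 0.002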
Comments:
We ran a chi-square test of association to test whether there is a statistical association between
EDUCATION and DIVORCED. The results are given in the table above. The p-value is 0.002 (Pearson
chi2(1) = 9.7878), which is less than the 5% significance level. We therefore reject the null
hypothesis of independence and conclude that there is strong evidence of an association between
EDUCATION and DIVORCED.
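The Pearson chi-square statistic can be recomputed from the four cell counts by comparing each observed count with the count expected under independence. The Python sketch below is an arithmetic cross-check; with one degree of freedom the p-value has a closed form through the complementary error function, so no statistics package is needed.

```python
import math

# Observed counts from the tabulate output:
# rows are EDUCATION = 0, 1; columns are DIVORCED = 0, 1.
observed = [[190, 80],
            [221, 49]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all four cells, where
# expected = row total * column total / grand total.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (count - expected) ** 2 / expected

# With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Share with DIVORCED = 1 in each EDUCATION group.
share_divorced = [row[1] / sum(row) for row in observed]

print(f"chi2(1) = {chi2:.4f}, p = {p_value:.4f}")
print([round(s, 3) for s in share_divorced])
```

The recomputed statistic matches the reported 9.7878, and the row shares show where the association comes from: the proportion with DIVORCED = 1 is higher in the EDUCATION = 0 group than in the EDUCATION = 1 group.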
Question 8:
. pwcorr SM SF SIBLINGS, sig

             |       SM       SF SIBLINGS
-------------+---------------------------
          SM |   1.0000
             |
          SF |   0.6364   1.0000
             |   0.0000
             |
    SIBLINGS |  -0.3344  -0.3214   1.0000
             |   0.0000   0.0000
Comments:
The table above presents Pearson correlation coefficients for the years of schooling of the mother
(SM), the years of schooling of the father (SF) and the number of siblings (SIBLINGS), with the
aim of testing whether these variables are linearly correlated. We observe a moderate positive
linear relationship between the years of schooling of the mother and of the father (r = 0.6364,
p < 0.001). There is a weak negative relationship between the years of schooling of the mother and
the number of siblings (r = -0.3344, p < 0.001). There is also a weak negative relationship
between the years of schooling of the father and the number of siblings (r = -0.3214, p < 0.001).
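The significance levels reported for these correlations rest on a t statistic, t = r × sqrt((n − 2) / (1 − r²)), with n − 2 degrees of freedom. The Python sketch below recomputes it from the reported coefficients. It assumes that all 540 respondents contribute to every pair, which the pwcorr output does not state.

```python
import math

n = 540  # assumption: no missing values in SM, SF or SIBLINGS

# Pearson coefficients copied from the pwcorr output.
correlations = {
    ("SM", "SF"): 0.6364,
    ("SM", "SIBLINGS"): -0.3344,
    ("SF", "SIBLINGS"): -0.3214,
}

def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

results = {}
for pair, r in correlations.items():
    # Store the t statistic and r squared (share of variance in common).
    results[pair] = (t_from_r(r, n), r * r)
    t, r2 = results[pair]
    print(f"{pair[0]}-{pair[1]}: r = {r:+.4f}, r^2 = {r2:.3f}, t = {t:+.2f}")
```

All three t statistics are far beyond the large-sample two-sided critical value for p = 0.001 (about 3.29), which is why Stata prints the significance level as 0.0000.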
Question 9:
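. spearman SM SF SIBLINGS, stats(rho p)
(obs=540)

+-----------------+
|       Key       |
|-----------------|
|       rho       |
|   Sig. level    |
+-----------------+

             |       SM       SF SIBLINGS
-------------+---------------------------
          SM |   1.0000
             |
          SF |   0.5998   1.0000
             |   0.0000
             |
    SIBLINGS |  -0.2874  -0.2524   1.0000
             |   0.0000   0.0000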
Comments:
In this section we replicate the analysis in Question 8, replacing the Pearson correlation
coefficient with the Spearman rank correlation coefficient. The results are presented in the table
above, where we observe that the coefficients are smaller in magnitude, although the signs remain
the same. Because Spearman's rho is computed on ranks, it measures monotonic rather than strictly
linear association. We observe a moderate positive monotonic relationship between the years of
schooling of the mother and of the father (rho = 0.5998, p < 0.001). There is a weak negative
relationship between the years of schooling of the mother and the number of siblings
(rho = -0.2874, p < 0.001). There is also a weak negative relationship between the years of
schooling of the father and the number of siblings (rho = -0.2524, p < 0.001).
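Spearman's rho is the Pearson correlation applied to the ranks of the data, with tied values given their average rank, so it captures monotonic association rather than linear association specifically. The Python sketch below illustrates the difference on hypothetical toy data. The function names are illustrative, and the numbers are not from the NLSY79 sample.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but along a curve rather than a line.
x_toy = [1, 2, 3, 4, 5]
y_toy = [v ** 3 for v in x_toy]
print(round(pearson(x_toy, y_toy), 4), round(spearman(x_toy, y_toy), 4))
```

For this perfectly monotonic but curved relationship, the rank-based coefficient reaches 1 while the Pearson coefficient stays below 1.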
Question 10:
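Assuming a log-log specification, in which both Y and X enter in logarithms, the coefficient β1 is
an elasticity: a 1 percent increase in X multiplies Y by (1.01)^β1. We can therefore interpret it
as follows: a one percent change in X results in a 100 × [(1.01)^β1 − 1] percent change in Y,
which is approximately β1 percent, while holding all other variables constant.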
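For a model in which both Y and X are in logarithms, the usual exact expression for the effect of a 1 percent change in X is 100 × [(1.01)^β1 − 1] percent. The Python sketch below evaluates it, assuming that specification. The coefficient value is hypothetical, since the estimated model is not shown in this part of the coursework.

```python
def pct_change_in_y(beta1, pct_change_in_x=1.0):
    """Exact percent change in Y for a given percent change in X,
    assuming the model ln(Y) = b0 + beta1 * ln(X) + ..."""
    return 100 * ((1 + pct_change_in_x / 100) ** beta1 - 1)

beta1_example = 0.5  # hypothetical coefficient, for illustration only

exact = pct_change_in_y(beta1_example)
print(f"exact: {exact:.4f}%  approximation: {beta1_example:.4f}%")
```

For small changes in X the exact figure is close to β1 itself, which is why the coefficient in a log-log model is commonly read directly as an elasticity.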