Econometrics Homework: Question 3 Detailed Solution and Analysis
Added on 2020/11/12
Homework Assignment
AI Summary
This document presents a detailed solution to an econometrics assignment, specifically addressing Question 3. The solution begins with data cleaning and descriptive statistics, including the deletion of irrelevant observations and the creation of key variables like 'School' and 'Wage'. It includes the generation and analysis of histograms for wage and log wage ('lwage'), extracting mean and median values for both. The assignment further explores the assumptions underlying Ordinary Least Squares (OLS) estimation, analyzing the impact of using log wages instead of wages, and interpreting the marginal effects of variables. It then delves into OLS regression modeling, calculating confidence intervals, and testing hypotheses related to ethnicity's influence on log wage rates. The solution incorporates statistical outputs, regression equations, and interpretations to provide a thorough understanding of the econometric concepts applied. Finally, the assignment provides a conclusion and references related to the analysis.

ECONOMETRICS

TABLE OF CONTENTS
QUESTION 3
(a) Deleting observations who are not paid
(b) Creating variables School and Wage along with other variables' descriptive statistics
(c) Drawing histogram of wage and lwage
(d) Extracting mean and median for both wage and lwage
(e) Listing assumptions under which estimates obtained through OLS are unbiased by presenting the outcome
(f) Application of log wages instead of wages
(g) Interpretation of γ1
(h) Marginal effects
(i) Estimating linear regression model by OLS where lwage is constant
(j) Creating confidence interval of 95% for percentage return to additional year of schooling
(l) Testing the null hypothesis that ethnicity does not play a role in explaining log wage rate
CONCLUSION
REFERENCES

QUESTION 3
(a) Deleting observations who are not paid
The table above tabulates paid work: individuals coded 1 are wage-sector workers and
individuals coded 0 are not. After deleting the observations for individuals who do not do any
paid work, the reported count is 1692.
(b) Creating variables School and Wage along with other variables' descriptive statistics
Variable    Obs     Mean        Std. Dev.    Min        Max
Paidwork    8748    0.386374    0.4869458    0          1
lwage       8748    0.2983099   0.5974387    -3.37688   4.208274
men         8748    0.4770233   0.4995003    0          1
malay       8748    0.4766804   0.4994844    0          1
chinese     8748    0.2844079   0.4511577    0          1
indian      8748    0.2389118   0.4264431    0          1
age         8748    33.62369    13.42969     15         65
agesq       8748    13.10888    10.20748     2.25       42.25
gexpr       8748    20.48331    15.83856     0          59
gesprsq     8748    6.703974    8.431564     0          34.81
yprim       8748    4.898491    2.076833     0          6
ysec        8748    2.243941    2.494018     0          14
fail        8748    0.1587791   0.3654909    0          1
School      8748    7.142433    3.934128     0          20
Wage        8748    2.983099    5.974387     -33.7688   42.08274
The table above reports descriptive statistics for the variables across 8748 observations, after
dropping the variables urban, unearn, househ, amtland and unearnx with the drop command in Stata.
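As a rough illustration of what each column of the table measures, the sketch below computes Obs, Mean, Std. Dev., Min and Max for a small made-up sample. The data are hypothetical, the helper name describe is an illustration only, and the construction School = yprim + ysec is an assumption (it is consistent with the reported means, since 4.898491 + 2.243941 = 7.142432).

```python
import math


def describe(values):
    """Return the summary columns used in the table: Obs, Mean, Std. Dev., Min, Max."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return {
        "obs": n,
        "mean": mean,
        "sd": math.sqrt(variance),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical toy records, not the assignment data.
yprim = [6, 6, 4, 0, 6]   # years of primary schooling
ysec = [5, 0, 0, 0, 7]    # years of secondary schooling

# Assumption: School is total years of schooling, yprim + ysec.
school = [p + s for p, s in zip(yprim, ysec)]

stats = describe(school)
print(stats)
```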
(c) Drawing histogram of wage and lwage
Histogram of Wage
Histogram of lwage
The histograms above show that the most frequent category lies in the range -0.07688 to
0.02312 for wage and -0.7688 to 0.2312 for lwage; in each case this is the only peak. Beyond
these ranges the frequencies decline.
(d) Extracting mean and median for both wage and lwage
Wage:
The command sum Wage, detail was used to extract the mean and median; it reports a
mean of 2.983099, a median of 0 and a standard deviation of 5.97.
lwage:
The command sum lwage, detail was used to extract the mean and median; it reports a
mean of 0.2983099, a median of 0 and a standard deviation of 0.5974387.
The main reason for the similar medians is that both variables are derived from the wage: the
logarithm is the inverse function of exponentiation. In simple terms, the logarithm of a given
number is the exponent to which a fixed base b must be raised to produce that number.
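To illustrate how the mean and the median behave under a logarithmic transformation, the following sketch uses a small made-up wage sample (hypothetical values, not the assignment data). Because the logarithm is strictly increasing, the median observation stays the same observation after the transformation, whereas the mean does not carry over in this way.

```python
import math
import statistics

# Hypothetical wages (odd sample size, so the median is a single observation).
wages = [0.5, 1.0, 2.0, 4.0, 20.0]
lwages = [math.log(w) for w in wages]

mean_wage = statistics.mean(wages)
median_wage = statistics.median(wages)
mean_lwage = statistics.mean(lwages)
median_lwage = statistics.median(lwages)

# The median commutes with the log transform; the mean does not.
print(median_lwage, math.log(median_wage))
print(mean_lwage, math.log(mean_wage))
```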

(e) Listing assumptions under which estimates obtained through OLS are unbiased by presenting
the outcome
The OLS assumptions are important because, when they hold, OLS is the best linear unbiased
estimator; its desirable properties merit a separate, detailed discussion. The assumptions are
stated below:
Assumption of linearity: If a linear model is fitted to data that are non-linearly related,
the model will be incorrect and unreliable.
Assumption of homoscedasticity: If the errors are heteroscedastic, the standard errors of
the OLS estimates are difficult to trust.
Assumption of independence/no autocorrelation: This is most likely to be violated in
time-series regression models, so with cross-sectional data it generally does not need to
be investigated.
Assumption of normality of errors: If the error terms are not normal, the standard errors
of the OLS estimates are unreliable, which makes confidence intervals too wide or too
narrow.
Assumption of no multicollinearity: It can be detected through a correlation matrix, or
through more complex methods such as the Variance Inflation Factor. A typical indication
of multicollinearity is regression coefficients with signs opposite to those expected.
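The multicollinearity check mentioned above can be sketched as follows. This is a minimal illustration on hypothetical data for two regressors; the helper names pearson and vif_two_regressors are made up for this example. With exactly two regressors, the R-squared from regressing one on the other equals their squared correlation, so the Variance Inflation Factor reduces to 1 / (1 - r^2).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x, y):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical values: age and general experience tend to move together.
age = [20, 30, 40, 50, 60]
gexpr = [3, 12, 24, 33, 45]

r_age_gexpr = pearson(age, gexpr)
print(r_age_gexpr, vif_two_regressors(age, gexpr))
```

A correlation close to 1 in absolute value, and hence a large VIF, would suggest that the two regressors carry largely the same information.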
The wage equation is Wage = α0 + α1 School + u, which is estimated as Wage = -0.5733793 +
0.497 School. The F test indicates that the overall regression model is a good fit for the
data. The result shows that the independent variable statistically significantly predicts the
dependent variable, F(1, 8746) = 1053.58, p < 0.0005. Similarly, the R-squared row
shows the R^2 value, also known as the coefficient of determination, which is the
proportion of variance in the dependent variable that can be explained by the independent
variables. In technical terms, it is the proportion of variation accounted for by the OLS
regression model above and beyond the mean model. The R^2 value is 0.1075, meaning that the
independent variable explains 10.75% of the variability of the dependent variable, wage.
The statistical significance of the independent variables is judged by the criterion that a
p-value smaller than 0.05 means the coefficient is statistically significantly different from
0. The P>|t| column shows that the independent variable coefficients are statistically
different from 0.
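As a consistency check on the figures above, the F statistic of a regression with k regressors and n observations can be written in terms of R^2. The worked equation below plugs in the reported R^2 = 0.1075, k = 1 and n = 8748; the result agrees with the reported F(1, 8746) = 1053.58 up to the rounding of R^2. For a single regressor, the t statistic of the slope is the square root of F.

```latex
F \;=\; \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}
  \;=\; \frac{0.1075 / 1}{0.8925 / 8746}
  \;\approx\; 1053.4,
\qquad
|t| \;=\; \sqrt{F} \;=\; \sqrt{1053.58} \;\approx\; 32.46 .
```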
(f) Application of log wages instead of wages
This is a simple linear regression model in which y (here the log wage) is the dependent
variable and school is the independent variable. The terms β0 and β1 are the parameters of the
model: β0 is the intercept and β1 is the slope parameter. They are referred to as the
regression coefficients. The equation for lwage is lwage = β0 + β1 School + u, which is
estimated as lwage = -0.05733793 + 0.0497 School. The F test indicates that the overall
regression model is an appropriate fit for the data. The result shows that the independent
variable statistically significantly predicts the dependent variable, F(1, 8746) = 1053.58,
p < 0.0005. The R^2, which measures the goodness of fit of the model, is 0.1075, or 10.75%.
The main difference from the model above is the dependent variable; simple linear regression
is applied in both cases.
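One practical consequence of using log wages is that the slope can be read as an approximate percentage change in the wage per additional year of schooling. The sketch below, using the slope 0.0497 reported above, compares the usual approximation 100 × β1 with the exact percentage change 100 × (exp(β1) - 1). The function name pct_effect is made up for this illustration.

```python
import math


def pct_effect(beta, delta=1.0):
    """Exact percentage change in the wage when the regressor rises by delta
    in a model of the form log(wage) = b0 + beta * x + u."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


beta_school = 0.0497  # slope on School in the lwage equation above

approx_pct = 100.0 * beta_school       # small-change approximation, about 4.97
exact_pct = pct_effect(beta_school)    # exact change, slightly above 5

print(round(approx_pct, 2), round(exact_pct, 2))
```

For small coefficients such as this one the two figures are close; the gap widens as the coefficient grows.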

(g) Interpretation of γ1
The model is lwage = γ0 + γ1 lschool + u, which is estimated as lwage = -0.2796049 +
0.3069981 lschool.
In the above scenario, γ1 is the coefficient of lschool, which is the independent
variable.
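Assuming lschool denotes the natural logarithm of School (an assumption, since its construction is not shown), the coefficient in a log-log model of this kind is conventionally read as an elasticity. A sketch of that reading with the estimate reported above:

```latex
\gamma_1 \;=\; \frac{\partial \ln(\text{wage})}{\partial \ln(\text{school})}
\;\approx\; \frac{\%\Delta\,\text{wage}}{\%\Delta\,\text{school}},
\qquad
\hat{\gamma}_1 = 0.3069981
\;\Longrightarrow\;
\text{a } 1\% \text{ increase in schooling is associated with a wage about } 0.31\% \text{ higher.}
```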
(h) Marginal effects
A marginal effect measures the instantaneous effect that a change in a specific
explanatory variable has on the predicted outcome, with the other covariates held fixed. It is
obtained by computing the derivative of the conditional mean function. The marginal effect of
a particular independent variable is the derivative, that is, the slope, of the specified
function of the coefficients and covariates from the preceding estimation. The mfx command is
applied here; it assumes that the variables in the estimation are independent.
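The idea of a marginal effect as a derivative can be sketched numerically. The example below uses hypothetical coefficients (not the estimates from this assignment) for a model with age and agesq, where agesq is taken to be age squared divided by 100; that scaling is inferred from the descriptive statistics, in which ages 15 and 65 correspond to agesq values of 2.25 and 42.25. It compares a numerical derivative with the analytic slope b_age + 2 × b_agesq × age / 100.

```python
def predicted_lwage(age, b0=-0.5, b_age=0.10, b_agesq=-0.12):
    """Predicted log wage from a quadratic in age (hypothetical coefficients)."""
    agesq = age ** 2 / 100.0  # assumed scaling of agesq
    return b0 + b_age * age + b_agesq * agesq


def marginal_effect(f, x, h=1e-5):
    """Central-difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def analytic_age_effect(age, b_age=0.10, b_agesq=-0.12):
    """Analytic derivative of predicted_lwage with respect to age."""
    return b_age + 2.0 * b_agesq * age / 100.0


me_numeric = marginal_effect(predicted_lwage, 30.0)
me_analytic = analytic_age_effect(30.0)
print(me_numeric, me_analytic)
```

If age and agesq were treated as unrelated variables, the reported effect of age would be b_age alone, which differs from the combined derivative shown here.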

(i) Estimating linear regression model by OLS where lwage is constant
lwage = 0.130331 age + 0.0687472 school + 0.601283 chinese + 0.531074 indian + 0.21289 men
- 0.7622766
The above is the regression equation obtained from the linear regression model estimated by
OLS, with its coefficients as stated. The R-squared is 0.2123, meaning that the independent
variables, together with the constant, explain 21.23% of the variability of the dependent
variable, lwage.
(j) Creating confidence interval of 95% for percentage return to additional year of schooling
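A 95% confidence interval of this kind is conventionally formed from the coefficient estimate and its standard error. The sketch below assumes the schooling coefficient from part (i), 0.0687472, is the relevant estimate; se denotes its standard error, which is not reported in the text, and k is the number of regressors in the model. With n = 8748 the critical value is approximately 1.96.

```latex
\hat{\beta}_{school} \;\pm\; t_{0.975,\;n-k-1}\cdot \operatorname{se}\!\left(\hat{\beta}_{school}\right),
\qquad t_{0.975,\;n-k-1} \approx 1.96,
\\[4pt]
\text{percentage return: } 100 \times \hat{\beta}_{school} \approx 6.87\%,
\quad \text{with interval endpoints } 100 \times \left(\hat{\beta}_{school} \pm 1.96\,\operatorname{se}\right).
```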
(l) Testing the null hypothesis that ethnicity does not play a role in explaining log wage rate
Null hypothesis: Ethnicity plays no statistically significant role in explaining the log wage
rate.
Alternative hypothesis: Ethnicity plays a statistically significant role in explaining the log
wage rate.
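A hypothesis of this form is usually tested with a joint F test on the ethnicity dummies. The sketch below assumes the regression from part (i), in which chinese and indian are included and malay is the omitted base category. R^2_u = 0.2123 is the reported unrestricted R-squared; R^2_r is the R-squared of the restricted model without the two dummies, which is not reported in the text, and k is the number of regressors in the unrestricted model.

```latex
H_0:\ \beta_{chinese} = \beta_{indian} = 0
\qquad \text{vs.} \qquad
H_1:\ \text{at least one of them is non-zero},
\\[4pt]
F \;=\; \frac{\left(R^2_{u} - R^2_{r}\right)/q}{\left(1 - R^2_{u}\right)/(n - k - 1)}
\;\sim\; F_{q,\;n-k-1}\ \text{under } H_0,
\qquad q = 2,\quad n = 8748 .
```

The null hypothesis is rejected at the 5% level when the p-value of this F statistic is below 0.05.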
CONCLUSION