Analysis of Aggregate Consumption Function Using Time Series Data


Added on  2020/12/09

This report provides a comprehensive analysis of the aggregate consumption function using time series data, focusing on the Australian economy. It begins by creating a time series dataset from the Eurostat database, discussing the concepts of consumption functions, anticipations, serial correlation, and stationarity. The report details the process of estimating an aggregate consumption function, addressing non-stationarity by examining alterations in the data. Task 2 involves an in-depth analysis of the time series dataset and a survey of consumer finances, utilizing logarithmic models, debt-to-income ratios, and gender-specific regressions. The report also presents the results of weighted and non-weighted regressions and includes diagnostic checks. The analysis uses Stata for econometric modeling and statistical analysis, including histograms, scatter plots, and regression outputs. The report concludes with a summary of the findings and includes do-files for both tasks in the appendix.
Assessment
Table of Contents
INTRODUCTION
Task 1
1. Creation of time series dataset to calculate aggregate consumption function
Assumptions
Serial correlation
Stationarity
Estimating an aggregate consumption function
Task 2
1. Analysis of time series dataset and survey of consumer finances
Logarithmic model
Debt-to-income ratio
Gender-specific regressions or dummy variables
Main specification
Diagnostic checks
Including extra variables and interaction terms
CONCLUSION
Appendix
Do-file for Task 1
Do-file for Task 2
INTRODUCTION
A time series dataset consists of data points recorded at successive, equally spaced
points in time, so it is a sequence of discrete-time data. A consumption function is the
functional relationship between total consumption and gross national income (Anselin, 2013).
This report first compiles a time series dataset from which an aggregate consumption function
is estimated. It then analyses a second dataset that is based on a survey of consumer
finances.
Task 1
1. Creation of time series dataset to calculate aggregate consumption function.
A time series records the values that a variable takes in successive periods such as
years, quarters or months. Such a series arises when the same quantity is measured at regular
intervals. A consumption function is an economic formula that expresses the relationship
between total consumption and gross national income.
The country chosen for this report is Australia. The data are taken from Eurostat, a
statistical database that holds around 4600 datasets comprising approximately 1.2 billion
statistical data values. It provides quarterly data on nominal gross disposable income of the
household sector, final consumption expenditure of households and a house price index
(2013=200). The Eurostat series are reported in million units of currency, so a price index is
needed to express them in real terms. A consumer price index is available on the OECD website
for Australia. Here the house price index obtained from Eurostat (2013=200) is used. It is not
the same as inflation, but it is an approximation that has been used to express the dataset in
real terms (Anselin, Florax and Rey, 2013).
No seasonally adjusted data are available for Australia, so the series used fluctuate
across quarters. The house price index (hpi) measures changes in the prices of residential
housing as a percentage change from a chosen base date. Such an index can be calculated by
repeat-sales regression, a simple moving average or hedonic regression. Here it serves as a
proxy for wealth and is one of the two explanatory variables used in the regression. The data
were then adjusted in Microsoft Excel and the file was imported into Stata, where the time
dimension was set and the variables were renamed. The resulting file is handed in through
Moodle for assessment.
Assumptions
Time series regression relies on the following assumptions, under which the OLS
estimators are unbiased:
• Linearity in parameters: the stochastic process $\{(x_{t1}, x_{t2}, \dots, x_{tk}, y_t) : t = 1, \dots, n\}$
follows the linear model $y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_k x_{tk} + u_t$, where
$\{u_t : t = 1, \dots, n\}$ is the sequence of errors, $n$ is the number of time periods and $k$ is
the number of explanatory variables.
• No perfect collinearity: in the sample, no independent variable is constant or a perfect
linear combination of the other independent variables.
• Zero conditional mean: for every $t$, the expected value of the error $u_t$ given the
explanatory variables in all time periods is zero, $E(u_t \mid X) = 0$ for $t = 1, \dots, n$, where
$X$ denotes the explanatory variables.
Under these assumptions the expected values of the ordinary least squares (OLS)
estimators equal the true parameters. This is known as the theorem of unbiasedness of OLS
for time series regressions.
Serial correlation
In cross-section regression, a key assumption is that different observations on y and e
are uncorrelated, that is, $\mathrm{cov}(y_t, y_s) = \mathrm{cov}(e_t, e_s) = 0$ for $t \neq s$,
where t and s denote different time periods (Asteriou and Hall, 2015). For the data considered
here this assumption is unlikely to hold. Consumption expenditure, income and wealth change
gradually rather than abruptly, so their current values depend on the values observed in
earlier periods. In other words, they are serially correlated.
Figure 1 Autocorrelation of disposable income and consumption expenditure for ten different
lags.
In the figure above, the first four autocorrelations are significantly different from zero at
the 5% level in both cases. Testing for autocorrelation in Stata makes the result clearer, as
the figure shows: disposable income depends on past disposable income (Brooks, 2019). Serial
correlation does not violate the assumptions of the theorem of unbiasedness of OLS estimators
for time series regression, but the usual variance formula for the estimator is no longer
reliable.
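The autocorrelation check described above can be sketched in Stata as follows. This is only an illustration on a simulated AR(1) series, not the report's data. The seed, the sample size of 80, the coefficient 0.8 and the variable names t, y and y_lag are all assumptions made for the example.

```stata
* Illustrative sketch on simulated data (all names and values are assumptions)
clear
set seed 12345
set obs 80
gen t = _n
tsset t
* build an AR(1) series recursively: y_t = 0.8*y_{t-1} + noise
gen y = rnormal() in 1
replace y = 0.8*y[_n-1] + rnormal() in 2/l
* first-order sample autocorrelation: correlation of y with its own lag
gen y_lag = L.y
correlate y y_lag
* autocorrelations and Q statistics for the first ten lags
corrgram y, lags(10)
```

For a persistent series such as this one, the first few autocorrelations would be expected to lie well away from zero, which is the pattern the report describes for income and consumption.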
Stationarity
A further assumption is violated in the dataset, which is why stationarity is considered
in this section. The graphs show that disposable income and consumption expenditure grow over
time. The series trend upward and also fluctuate across quarters because, as mentioned above,
the Eurostat dataset is not seasonally adjusted.
Figure 2 Household consumption expenditure and real gross disposable household income over
million units of Polish zloty.
Time series regression relies on the assumption that the variables under consideration are
stationary. A time series $y_t$ is stationary if its mean and variance are constant over time
and the covariance between two values of the series depends only on the length of time that
separates them, not on the actual times at which they are observed (Elhorst, 2014).
The Dickey-Fuller test is a unit root test whose null hypothesis is that the series has a unit
root, which means that the variable is non-stationary. When the test is applied to household
consumption expenditure and disposable income, the null hypothesis cannot be rejected at
reasonable significance levels, so both series are treated as non-stationary. The unit root
null hypothesis also cannot be rejected for the house price index.
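A minimal sketch of the unit root test is given below. It uses a simulated random walk rather than the report's series, so the seed, the sample size and the variable names t, e and y are assumptions. The test is applied first to the level and then to the first difference.

```stata
* Illustrative sketch on simulated data (all names and values are assumptions)
clear
set seed 2024
set obs 100
gen t = _n
tsset t
gen e = rnormal()
gen y = sum(e)          // random walk: y_t = y_{t-1} + e_t, so y has a unit root
* Dickey-Fuller test on the level; null hypothesis: y has a unit root
dfuller y, lags(1)
* Dickey-Fuller test on the first difference, which is white noise here
dfuller D.y, lags(0)
```

A small MacKinnon p-value leads to rejection of the unit root null hypothesis. For a random walk one would expect no rejection in levels and a clear rejection after differencing.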
Figure 3 Consumption expenditure with disposable income and house price index with time in
quarters
The graph above shows the house price index together with consumption expenditure over time.
The series are plotted on different scales and both appear non-stationary. Apart from this, house price
Figure 4 Changes in real final household consumption expenditure and real gross disposable
income of households.
To handle this non-stationarity it is better to work with the changes (first differences) of
the variables rather than with the variables themselves. The graphs show that the differenced
series do not follow a trend but fluctuate more. When the Dickey-Fuller test is applied to the
differences, the null hypothesis is rejected in both cases, i.e. for disposable income and for
consumption expenditure (Friedman, 2018).
Estimating an aggregate consumption function
Now that the assumptions have been stated and the reasons why they do not hold have
been explained, the aggregate consumption function for Australia can be estimated. Two
variables are not stationary, so they cannot be used in levels in the regression model and the
consumption function is estimated in differences. In the specifications tried, the p-value of
the house price index coefficient is found to be high. The best functional form for the
dataset (y1998 i5.dta) is shown here:
$\Delta \ln(consexp_t) = \beta_1 \Delta \ln(inc_t) + \beta_2 \Delta hpi_t + e_t$
This specification does not include a constant. In this case the p-value of the regression is
low and the coefficient of determination is high.
Figure 5 Stata output of the regression of the difference in final consumption expenditure on
the difference in disposable income and the difference in the house price index, without a constant.
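The differenced specification without a constant can be sketched as follows. The data are simulated, so the start quarter 2005q1, the data-generating coefficients and the variable names q, lninc, hpi and lncon are assumptions made for illustration only.

```stata
* Illustrative sketch on simulated quarterly data (names and values are assumptions)
clear
set seed 42
set obs 60
gen q = tq(2005q1) + _n - 1      // hypothetical start quarter
format q %tq
tsset q
gen lninc = 10 + sum(0.01 + 0.02*rnormal())   // trending log income
gen hpi   = 100 + sum(rnormal())              // trending house price index
gen lncon = 0.5 + 0.6*lninc + 0.001*hpi + 0.01*rnormal()
* regression in first differences, constant suppressed
regress D.lncon D.lninc D.hpi, noconstant
* coefficient on D.lninc: elasticity of consumption with respect to income
display "estimated income elasticity: " _b[D.lninc]
```

When noconstant is specified, Stata computes R-squared from the uncentered total sum of squares, so the value is not directly comparable with the R-squared of a regression that includes a constant.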
The Stata output can be interpreted as follows: a 1% change in current income relative
to the previous period is associated with a -0.56% change in consumption expenditure. The mean
of the residuals lies close to 0.2, although this cannot be seen clearly in the histogram.
Task 2
1 Analysis of time series dataset and survey of consumer finances.
The dataset y1998 i5.dta, which was published on the Moodle platform, is used for this
task. All regressions, histograms and data manipulations presented in this section are based
on this dataset.
Histograms:
Histograms of a few variables are created. The figure shows two histograms of income,
one weighted and the other non-weighted (Gourieroux and Jasiak, 2018). It is clear that the
weights make a significant difference for income. The SCF dataset contains weights that are
built as frequency weights. Our sample has 331 observations, which contain 3895 records.
Because these 3895 values carry different weights, weighted histograms and weighted
regressions are used in this task. In the histograms income is limited to 1 million USD,
because the structure of the data cannot be seen when no limit is applied.
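The effect of frequency weights can be illustrated with a toy example. The three income values and the weights below are invented for the illustration and the variable names simply mirror those in the do-file. A frequency weight of 3 means that the record stands for three identical observations.

```stata
* Toy example with invented numbers: what a frequency weight does
clear
input income wgt2
 20000 3
 50000 1
200000 1
end
* unweighted mean: (20000 + 50000 + 200000)/3 = 90000
summarize income
* weighted mean: (3*20000 + 50000 + 200000)/5 = 62000
summarize income [fweight=wgt2]
* the same result is obtained by physically replicating each record wgt2 times
expand wgt2
summarize income
```

Frequency weights must be integers, which is why the do-file rounds the survey weight before using it.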
[Two histograms: density against income, 0 to 1,000,000]
Figure 6 Histograms of incomes under 1 million USD (upper histogram is not weighted while
lower is weighted)
The weighted histograms of debt and houses show that a large share of individuals have
neither a house of their own nor debt. The density of debt is concentrated further to the left
than the density of houses (Henderson and Parmeter, 2015).
[Two histograms: density against houses and density against debt, each from 0 to 1,000,000]
Figure 7 Weighted histograms of houses and debt.
Scatter plots
A scatter plot is a set of points plotted against a horizontal and a vertical axis.
Scatter plots are useful because they show the extent of correlation between variables (Liu,
2015). In these plots debt and income are concentrated in the bottom left quadrant, which
indicates that most people have low income. Along the bottom of the plot the blue points form
a line along the income axis. A few people have high income.
Figure 8 Debt-income scatter plot for debt and no limit for age.
The relationship between debt and age is shown in the right-hand scatter plot. Indebted
individuals are found in every age group. When the range is restricted, i.e. to debt below
10,000 USD, the pattern does not become any clearer. This indicates that debt can occur at any
age, although the reasons behind it can vary. The plots are not weighted, because weighting
creates large dots that are not informative.
Logarithmic model
Regression estimates the relationship between the mean value of one variable and the
values of other variables (Mukherjee, Wuyts and White, 2013). The population model is the
relationship that is assumed to hold in the population and that the regression estimates from
the sample. The first regression is based on the following population model for debt:
$\ln(debt) = \beta_0 + \beta_1 income + \beta_2 houses + \beta_3 female + \beta_4 age + \beta_5 age^2 + u$
When this specification is estimated in Stata without weights, the result is as shown
below:
Figure 9 Stata output for non-weighted regression of ln (debt) on income, houses, female, age
and age2.
The output shows that the coefficients of houses and income are small and positive.
The coefficient of female is negative: debt is lower by almost 115% c.p. (ceteris paribus) for
households with a female head. The coefficients of age and age2 are positive and negative
respectively, which indicates that ln(debt) first rises with age and becomes smaller at higher
ages (Phlips, 2014).
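When the dependent variable is in logs, 100 times the coefficient of a dummy variable is only an approximation of the percentage effect, and the approximation becomes poor for large coefficients. The exact effect is $100 \times (e^{\beta} - 1)$. The sketch below uses a hypothetical coefficient of -1.15, chosen only because it would correspond to the figure of almost 115% quoted above. It is not taken from the Stata output.

```stata
* Hypothetical coefficient of a 0/1 dummy in a regression of ln(debt)
scalar b_female = -1.15
* approximate percentage effect: 100*b
display "approximate effect: " 100*b_female " percent"
* exact percentage effect: 100*(exp(b) - 1), roughly -68 percent here
display "exact effect:       " 100*(exp(b_female) - 1) " percent"
```

With a coefficient of this size the exact effect is a reduction of roughly 68%, far from the approximate value, and a reduction of more than 100% is not possible for a non-negative variable such as debt.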
For a weighted regression, we obtain:
Figure 10 Stata output for weighted regression
In the weighted regression the coefficients of houses and income are larger. The
coefficient of female is smaller in magnitude and indicates that household debt is lower by
around 66.95%. The coefficients of age and age2 change in a similar way. All p-values are zero,
which means that the null hypothesis that a coefficient equals zero can be rejected in each
case. This includes income, whose p-value was not zero in the non-weighted regression. The
coefficients in the weighted regression are also less extreme in size.
The coefficient of determination R2 is higher than in the non-weighted regression. In these
regressions the dummy variable female is used instead of hhsex.
Debt-to-income ratio
The debt-to-income (DTI) ratio is a personal finance measure that compares the monthly
debt payments of an individual with their gross income for the month (Stock and Watson, 2015).
The population model in this case is shown below:
$debt/income = \beta_0 + \beta_1 income + \beta_2 houses + \beta_3 female + \beta_4 age + \beta_5 age^2 + u$
When this regression is estimated, the result is as follows:
Figure 11 Outcome of Stata in case of weighted regression
The output shows that the coefficient of determination falls to 0.0148. This means
that only about 1.49% of the variation in the DTI ratio is explained by the explanatory
variables. In addition, the coefficient of income is negative and the coefficient of female is
positive, both of which are counterintuitive.
The logarithmic specification is preferable, because an endogeneity problem can arise
when income appears in both the explained variable and the explanatory variables.
Gender-specific regressions or dummy variables
The regression can also be run separately for each value of hhsex, where hhsex = 2
means that the household head is female and hhsex = 1 means that the head is male. This yields
two regressions whose coefficients differ in size. For example, for female household heads the
effect of income is larger and positive. The effect on ln(debt) for females is about double
the effect for males.
Figure 12 Stata output for regressions conditional on hhsex
The coefficient of determination is higher for males, i.e. hhsex = 1. The Stata output is
easier to interpret when dummy variables are used. The assumption of no perfect collinearity
is violated when both dummy variables are included in the same regression together with a
constant. In that case the constant must be omitted.
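The collinearity problem with two complementary dummy variables can be sketched as follows. The data are simulated, and the seed, the share of female heads and the variables x and y are assumptions. Only the coding hhsex = 1 for male and hhsex = 2 for female follows the do-file.

```stata
* Illustrative sketch on simulated data (names and values are assumptions)
clear
set seed 7
set obs 200
gen hhsex  = 1 + (runiform() < 0.3)   // 1 = male, 2 = female
gen female = hhsex == 2
gen male   = hhsex == 1
gen x = rnormal()
gen y = 1 + 0.5*x - 0.4*female + rnormal()
* female + male equals 1 for every observation, which duplicates the constant:
* Stata omits one of the dummies because of collinearity
regress y x female male
* without a constant both dummies are kept; each coefficient is a group intercept
regress y x female male, noconstant
* usual alternative: one dummy plus a constant
regress y x female
```

In the last regression the coefficient of female measures the difference between the two group intercepts, which is usually the quantity of interest.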
Main specification
The specification that suits the dataset y1998 i5.dta best is:
$\ln(debt) = \beta_0 + \beta_1 \ln(income) + \beta_2 \ln(houses) + \beta_3 female + \beta_4 age + \beta_5 age^2 + u$
The Stata output obtained for it is:
Figure 13 Main specification chosen
A 1% change in income is associated with a 0.8929% change in debt. Similarly, a 1% change in
houses is associated with a 0.293% change in debt. For households with a female head, debt is
higher by 16.79%. This signifies that savings have a negative impact.
This specification is chosen because the coefficient of determination is greater (0.38), the
null hypotheses of zero coefficients are rejected, and log-log models are intuitive to
interpret.
Diagnostic checks
The residuals are predicted and then summarised. Their mean is approximately zero, but
the residuals are not normally distributed.
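The diagnostic steps can be sketched on the auto dataset that ships with Stata. The regression below is not the report's specification, and the variables lnprice, lnweight and ehat are created only for this illustration.

```stata
* Illustrative sketch on Stata's built-in auto dataset (not the SCF data)
sysuse auto, clear
gen lnprice  = ln(price)
gen lnweight = ln(weight)
regress lnprice lnweight foreign
* residuals of the fitted model
predict ehat, residuals
* with a constant in the model the OLS residuals average to zero by construction
summarize ehat
* skewness and kurtosis test; null hypothesis: residuals are normally distributed
sktest ehat
* graphical checks: quantile-normal plot and histogram with a normal density
qnorm ehat
histogram ehat, normal
```

Because a mean of zero is guaranteed whenever the regression includes a constant, the shape of the residual distribution is the more informative part of the check.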
[Left panel: histogram of residuals (density). Right panel: residuals against inverse normal.]
Figure 14 Residual histogram and qnorm plot
Including extra variables and interaction terms
The other coefficients barely change when the extra variables are included. The
coefficient of determination and the constant increase slightly (West, Welch and Galecki,
2014). There is hardly any effect on the residuals, although their mean lies closer to zero.
Figure 15 Stata output for regression with extra variables
Higher purchases are associated with higher outstanding liabilities. With these terms
the histogram of the residuals comes closer to a normal distribution. This is shown below:
[Histogram: density against residuals]
Figure 16 Histogram of residuals. Regression using X720 and interaction term
CONCLUSION
This report has shown how a time series dataset can be created from which an aggregate
consumption function can be estimated. Australia is the country chosen for this dataset (i.e.
dataset y1998 i5.dta). Private consumption and aggregate household disposable income are
measured, and a dynamic OLS estimator is used to estimate the aggregate consumption function.
In addition, a descriptive analysis is carried out with scatter plots and histograms, and debt
accumulation functions are estimated.
REFERENCES
Books & Journals
Anselin, L., 2013. Spatial econometrics: methods and models (Vol. 4). Springer Science &
Business Media.
Anselin, L., Florax, R. and Rey, S. J. eds., 2013. Advances in spatial econometrics: methodology,
tools and applications. Springer Science & Business Media.
Asteriou, D. and Hall, S. G., 2015. Applied econometrics. Macmillan International Higher
Education.
Brooks, C., 2019. Introductory econometrics for finance. Cambridge University Press.
Elhorst, J. P., 2014. Spatial econometrics: from cross-sectional data to spatial panels (Vol. 479,
p. 480). Heidelberg: Springer.
Friedman, M., 2018. Theory of the consumption function. Princeton University Press.
Gourieroux, C. and Jasiak, J., 2018. Financial econometrics: Problems, models, and methods.
Princeton University Press.
Henderson, D. J. and Parmeter, C. F., 2015. Applied nonparametric econometrics. Cambridge
University Press.
Liu, X., 2015. Applied ordinal logistic regression using Stata: From single-level to multilevel
modeling. Sage Publications.
Mukherjee, C., Wuyts, M. and White, H., 2013. Econometrics and data analysis for developing
countries. Routledge.
Phlips, L., 2014. Applied Consumption Analysis: Advanced Textbooks in Economics (Vol. 5).
Elsevier.
Stock, J. H. and Watson, M. W., 2015. Introduction to econometrics.
West, B. T., Welch, K. B. and Galecki, A. T., 2014. Linear mixed models: a practical guide
using statistical software. Chapman and Hall/CRC.
Appendix
************Compile dataset************
*import data from excel file
cd "\\C:\Users\Desktop\Assessment\Methods"
import excel "\\C:\Users\Desktop\Assessment\Methods\TS_dataset.xls", sheet("stata") firstrow
*set time dimension with quarterly data by using new var quarters
*(mask "YQ" assumes TIME is a string such as 2005Q1)
gen quarters=quarterly(TIME,"YQ")
tsset quarters, quarterly
*rename variables to make them easier to use (names match the Task 1 do-file)
rename husconsexpREAL husconsexp
rename DispIncREAL disinc
rename HousePriceIndex huspi
*label variables so that the reader can understand them
label var husconsexp "real final household consumption expenditure"
label var disinc "real disposable household sector income"
label var huspi "house price index, 2013=200"
*save file in order to hand in
save "\\C:\Users\Desktop\Assessment\Methods\tsd.dta", replace
Do-file for Task 1
***********TASK 1 (SELF-COMPILED DATASET)**********
clear all
*set directory and load dataset
cd "\\C:\Users\Desktop\Assessment\Methods"
use "tsd.dta"
*exploration of data and plot them against each other in quarters
***scatter
twoway (scatter husconsexp disinc, msize(small))
scatter husconsexp quarters //upwards trend
scatter huspi quarters
scatter disinc quarters //upwards trend
***plot variables in same graphs
tsline husconsexp
tsline disinc
tsline huspi
tsline husconsexp disinc
tsline husconsexp disinc huspi
* autocorrelation check, starting with consumption expenditure
summarize husconsexp
scatter husconsexp L.husconsexp, xline(`r(mean)') yline(`r(mean)')
sum husconsexp
return list
ac husconsexp, lags(10) // first 4 autocorrelations differ from 0 at the 5% level
corrgram husconsexp, lags(10)
*** autocorrelation test with disposable income
summarize disinc
scatter disinc L.disinc, xline(`r(mean)') yline(`r(mean)')
summarize disinc
return list
ac disinc, lags(10) // again different from 0 at the 5% level
corrgram disinc, lags(10)
***similarly autocorrelation check for house price index
summarize huspi
scatter huspi L.huspi, xline(`r(mean)') yline(`r(mean)')
sum huspi
return list
ac huspi, lags(10) // first 3 autocorrelations differ from 0 at the 5% level
corrgram huspi, lags(10)
*unit root tests for stationarity
***Dickey-Fuller test: the unit root null hypothesis cannot be rejected in levels
dfuller husconsexp, regress lags(1)
dfuller disinc, regress lags(1)
dfuller huspi, regress lags(1)
*unit root tests for the first differences
***in all cases the unit root null hypothesis is rejected
dfuller D.husconsexp, regress lags(0)
dfuller D.disinc, regress lags(0)
dfuller D.huspi, regress lags(0)
***differences are plotted to check whether the series look stationary
tsline D.husconsexp
tsline D.disinc
tsline D.huspi
*estimation of consumption function, with and without differences
regress husconsexp disinc huspi // high p-values for cons and huspi coefficient
***with differences
regress D.husconsexp D.disinc D.huspi // p-value of huspi coefficient is high
regress D.husconsexp D.disinc huspi // cons and huspi coefficient have high p-values
*** logs
gen lncon=ln(husconsexp)
gen lninc=ln(disinc)
gen lnhpi=ln(huspi)
regress lncon lninc huspi //high p-value of huspi coefficient
regress husconsexp disinc lnhpi //huspi coeff and cons have high p values
regress D.lncon D.lninc D.lnhpi //high pvalue for lnhpi coefficient
regress D.lncon D.disinc D.huspi // p-value of D.huspi coefficient is 0.14
regress D.lncon D.lninc huspi
regress D.lncon D.lninc huspi, noconstant //lninc t -10.05, p-value is 0
regress D.lncon L(0/3).D.lninc L(0/3).D.huspi
regress D.lncon L(0/3).D.lninc huspi
***try w/o huspi
regress D.husconsexp D.disinc // results improve: p-values 0, t=-11.28 for disinc
predict ehat, residual
ac ehat, lags(15)
regress D.ehat L.ehat L.D.ehat, noconstant
dfuller ehat, lags(1)
hist ehat
sum ehat // mean is around zero
drop ehat //for later
regress D.lncon D.lninc D.huspi, noconstant //for lninc t -7.29, p-value is 0
predict ehat, residual
sum ehat //mean 0.0175
hist ehat
regress D.ehat L.ehat L.D.ehat, noconstant //null hypothesis is not integrated neither avoided.
dfuller ehat, noconstant lags(1)
ac ehat
Do-file for Task 2
*importing SCF data
cd "\\C:\Users\Desktop\Methods"
use "dataset y1998 i5.dta"
*Importance of Survey weights
generate wgt2=round(wgt)
hist wgt2
summarize wgt2
summarize income
summarize income [fweight=wgt2]
* Creation of histograms to get an overview of the data (limit of 1 million USD as in the report)
hist income if income<=1000000
hist income [fweight=wgt2] if income<=1000000
hist debt [fweight=wgt2] if debt<=1000000
hist houses [fweight=wgt2] if houses<=1000000
hist age [fweight=wgt2]
*creation of scatter plots to gain an overview of the data
scatter debt income
scatter debt income if income<=1000000 & debt<=1000000 //to avoid focusing on outliers
scatter debt houses
scatter debt age
scatter debt age if debt<=150000 //avoid focusing on outliers
scatter debt hhsex [pweight=wgt] // more male respondents, and they are more indebted
*creation of age square variable, for the regression analysis
generate age2=age^2
*lndebt variable for the regression analysis
generate lndebt=ln(debt)
replace lndebt=0 if debt==0
**other logarithmic transformations
gen lninc=ln(income)
gen lnhouses=ln(houses)
replace lnhouses=0 if houses==0
sum houses, detail
*debt to income ratio for regression analysis
generate debtincratio=debt/income
*indicator variables
sum hhsex
gen female=1 if hhsex==2
replace female=0 if hhsex==1
gen male=1 if hhsex==1
replace male=0 if hhsex==2
*execution of regression with and without weights
regress debt income houses
regress debt income houses [fweight=wgt2]
*regressions with dummy variables; avoid the dummy variable trap by adding noconst
regress debt income houses male female [pweight=wgt], noconst
regress debt income houses female age age2 //negative constant, income effect small
regress debt income houses female age age2 [fweight=wgt2]
*logarithmic model
regress lndebt income houses female age age2
regress lndebt income houses female age age2 [fweight=wgt2]
*debt-to-disinc ratio
regress debtincratio income houses female age age2 [fweight=wgt2]
*separation of regressions for genders
regress lndebt income houses age age2 [fweight=wgt2] if hhsex==1
regress lndebt income houses age age2 [fweight=wgt2] if hhsex==2
regress lndebt income houses male age age2 [fweight=wgt2]
regress lndebt income houses female age age2 [fweight=wgt2]
regress lndebt income houses female male age age2 [fweight=wgt2], noconst
*main specification
regress lndebt lninc lnhouses female age age2 [fweight=wgt2]
predict resids, res
hist resids [fweight=wgt2]
drop resids // so that resids can be predicted again below
*diagnostics of regression
regress lnhouses lninc [fweight=wgt2]
predict resids, res
qnorm resids
hist resids [fw=wgt2]
hist resids if resids<5 [fweight=wgt2], normal normopts(lcolor(red)) kdensity
sum resids [fweight=wgt2], detail
mean resids [fweight=wgt2]
*skewness-kurtosis test for normality (run before the residuals are dropped)
sktest resids [fweight=wgt2]
drop resids
*include X718 and X720
***create dummy
sum X718
gen inh=1 if X718==1
label var inh "inheritance or gift"
replace inh=0 if X718==5
replace inh=0 if X718==0
sum inh
***Regress
regress lndebt lninc lnhouses female age age2 inh [fweight=wgt2]
*****diagnostics
predict resids, res
qnorm resids
hist resids [fw=wgt2]
hist resids if resids<5 [fweight=wgt2], normal normopts(lcolor(red)) kdensity
sum resids [fweight=wgt2], detail
mean resids [fweight=wgt2]
drop resids
***with X720
regress lndebt lninc lnhouses female age age2 inh X720 [fweight=wgt2]
*****diagnostics
predict resids, res
qnorm resids
hist resids [fw=wgt2]
hist resids if resids<5 [fweight=wgt2], normal normopts(lcolor(red)) kdensity
sum resids [fweight=wgt2], detail
mean resids [fweight=wgt2]
drop resids
*normal distribution is not attained
***terms of interactions
gen lninclnhouses=lninc*lnhouses
regress lndebt lninc lnhouses female age age2 inh X720 lninclnhouses [fweight=wgt2]
*****diagnostics
predict resids, res
qnorm resids
hist resids [fw=wgt2]
hist resids if resids<5 [fweight=wgt2], normal normopts(lcolor(red)) kdensity
sum resids [fweight=wgt2], detail
mean resids [fweight=wgt2]
drop resids