Quantitative Methods: Regression Analysis, Model Problems & Solutions

Added on 2023/06/15

AI Summary

This assignment solution delves into various quantitative methods, addressing key concepts and potential problems encountered in regression analysis. It begins by examining multicollinearity, explaining its causes, effects on model interpretation, and methods for detection and resolution, such as removing correlated variables or using principal component analysis. The document then discusses dummy variables, their purpose in representing categorical data, and their application in regression models. Furthermore, it explores autocorrelation, its causes, consequences for ordinary least squares (OLS) estimators, and limitations of the Durbin-Watson test. Finally, the solution differentiates between fixed effects and random effects models, highlighting their assumptions, applications in panel data analysis, and implications for statistical inference. This assignment is designed to provide a comprehensive understanding of these essential econometric concepts.

Quantitative Methods

TABLE OF CONTENTS
QUESTION 1
Presenting whether the model suffers from a multicollinearity problem
QUESTION 2
QUESTION 3
a) Dummy variable and purpose of it
b) Test
c) Interpretation
QUESTION 4
a) meaning and cause of autocorrelation
b) Drawbacks of Durbin Watson test
QUESTION 5
Major difference between fixed effects and random effects models
REFERENCES

QUESTION 1
Presenting whether the model suffers from a multicollinearity problem
Regression results can be distorted by several specification problems, one of which is
multicollinearity. The model above does appear to suffer from multicollinearity, because several
of the independent variables in the regression equation are highly correlated with one another.
This makes the individual coefficients hard to interpret and can contribute to overfitting
(Daoud, 2017). In the given results, the demand for passenger cars depends on five variables:
the new car consumer price index, the consumer price index, personal disposable income, the
interest rate and the employed civilian labour force. Several of these are likely to move
together, so one predictor can largely be predicted from the others and carries redundant
information, which inflates the standard errors of the estimated coefficients. Multicollinearity
can be detected with the variance inflation factor (VIF), which measures how well each predictor
is explained by the remaining predictors.
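The VIF check described above can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the data from the question: the variable names x1, x2 and x3 and the helper vif are assumptions made for the example. Each predictor is regressed on the others and VIF = 1 / (1 - R square) is reported. A common rule of thumb treats values above about 10 as a sign of serious multicollinearity.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]                                  # predictor treated as the outcome
        others = np.delete(X, j, axis=1)             # all remaining predictors
        A = np.column_stack([np.ones(n), others])    # add an intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        centred = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centred @ centred)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Simulated (hypothetical) predictors: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # large values for x1 and x2, a value close to 1 for x3
```

In this sketch the two near-duplicate predictors receive very large VIFs while the unrelated one stays close to 1, which is the pattern one would look for among the five regressors of the car demand model.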
In addition, multicollinearity does not necessarily reduce the model's overall predictive
accuracy, but it does make the individual coefficient estimates unreliable. It inflates the
standard errors, which makes it difficult to test individual regression coefficients, so the
statistician may be unable to conclude that a given X has a significant relationship with Y
even when one exists. High correlation among the predictor variables creates redundant
information that can distort the estimates and even reverse their expected signs (Weaving et
al., 2019). In the results generated, the value of R square is 0.85, which means that about 85%
of the variation in the demand for cars is explained by the independent variables taken
together. A high R square alongside individually weak coefficients is a typical symptom of
multicollinearity. To fix the issue, some of the highly correlated independent variables can be
removed. Alternatively, the correlated variables can be replaced by a smaller set of
uncorrelated components obtained from a principal components analysis, which removes the
multicollinearity among the regressors.
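The principal components remedy mentioned above can be illustrated as follows. This is only a sketch under assumed, simulated data (the matrix X below is invented for the example), using a singular value decomposition from NumPy rather than any particular statistics package. It shows that the component scores are uncorrelated with one another, which is why regressing on them avoids the multicollinearity problem.

```python
import numpy as np

# Simulated (hypothetical) predictors: the first two share a common factor.
rng = np.random.default_rng(1)
n = 300
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.1 * rng.normal(size=n),
    base + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])

# Standardise each column, then take the SVD of the standardised matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

scores = Z @ Vt.T                     # principal component scores
explained = s**2 / np.sum(s**2)       # share of total variance per component
score_corr = np.corrcoef(scores, rowvar=False)

print(np.round(explained, 3))
print(np.round(score_corr, 3))        # identity matrix up to rounding
```

The cost of this approach is interpretability: each component is a mixture of the original variables, so its coefficient no longer has a direct economic meaning.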
Another solution is to combine the collinear independent variables linearly into a single
variable, for example an index or a ratio, so that the shared information enters the equation
only once. In the context of the given question, closely related regressors such as the two
price indices could be combined in this way. Alternatively, the collinear independent variables
can first be identified and then removed, since dropping a redundant regressor usually
stabilises the remaining estimates (Majid, Amin and Akram, 2021). Computing the VIF for each
predictor before and after such changes shows whether the problem has been reduced. Overall,
the issue needs to be addressed because Y depends on several related variables and the
coefficient estimates can fluctuate widely with small changes in the data. Through these
measures the demand function for passenger cars can be estimated more reliably.
QUESTION 2
QUESTION 3
a) Dummy variable and purpose of it
A dummy variable is a numeric variable used to represent categorical data such as gender,
race or political affiliation. A categorical variable with k categories is represented by k - 1
dummy variables, with the omitted category serving as the reference group. It is called a dummy
variable because it stands in for an artificial, qualitative attribute with two or more
categories or levels. A dummy variable takes only the values 0 and 1, which indicate the absence
or presence of the categorical effect that may shift the outcome. Its main purpose is to enable
a researcher to use a single regression equation to represent multiple groups (Roh, 2020).
There is no need to write a separate equation for each group, because the dummies act as
switches that turn different parameters on and off within one regression equation. Switching a
dummy from 0 to 1 changes only the parameters attached to it, typically the intercept, and
leaves the rest of the equation unchanged.
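A short Python sketch may help show how a dummy variable works as a switch. The data are simulated and the numbers (a base height of 165 and a shift of 4) are invented for illustration only, not taken from the regression output in the question. With a single 0/1 regressor, the OLS intercept equals the mean of the reference group and the dummy coefficient equals the difference between the two group means.

```python
import numpy as np

# Simulated (hypothetical) sample: male = 1 for men, 0 for women.
rng = np.random.default_rng(2)
n = 100
male = rng.integers(0, 2, size=n)
height = 165.0 + 4.0 * male + rng.normal(scale=5.0, size=n)  # invented values

# OLS of height on an intercept and the dummy.
A = np.column_stack([np.ones(n), male])
(b0, b1), *_ = np.linalg.lstsq(A, height, rcond=None)

mean_f = height[male == 0].mean()   # reference group (dummy switched off)
mean_m = height[male == 1].mean()   # group with the dummy switched on

print(b0, mean_f)            # intercept matches the female mean
print(b1, mean_m - mean_f)   # dummy coefficient matches the gap in means
```

One equation therefore describes both groups: the fitted value is b0 when the dummy is 0 and b0 + b1 when it is 1.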
b) Test
The test applied to produce the output is a regression, which measures how far a dependent
variable depends on one or more independent variables. This tool is used to estimate
the effect of one or more explanatory variables on the dependent variable. In this question,
height is the dependent variable and gender, entered as the dummy variable male, is the
explanatory variable. The significance of the relationship is judged from the t-statistic and
p-value of the dummy's coefficient (Blackburn, 2021).
c) Interpretation
The regression output above shows a statistically significant relationship between height and
the male dummy, because the p-value is lower than the conventional significance level. The null
hypothesis that the coefficient on male is zero is therefore rejected in favour of the
alternative. Because male is a dummy variable, its coefficient is read as a difference between
groups rather than as the effect of a unit change: the average height of males differs from
that of females by about 4 units (m or cm, depending on how height is measured), other things
being equal. Thus, the regression table indicates a clear association between gender and height
(Jing et al., 2021).
QUESTION 4
a) meaning and cause of autocorrelation
Autocorrelation represents the degree of similarity between a time series and a lagged
version of itself over successive time intervals. It measures the relationship between a
variable's current value and its past values (Blazquez et al., 2018). It is also known as
serial correlation or serial dependence. The main causes of autocorrelation are as follows:
• Time to adjust (inertia): this is common in macroeconomic data, where reaching a new
equilibrium after a change takes time.
• Prolonged influence: the effect of an economic shock, for example to the exchange rate,
persists over several periods (Miralha and Kim, 2018).
• Data smoothing: applying a function to smooth the data, such as averaging, introduces
autocorrelation into the disturbance terms.
• Misspecification: omitting a relevant independent variable pushes its effect into the error
term, which then becomes autocorrelated.
• Wrong choice of features or functional form: choosing the wrong inputs or the wrong form for
them leaves a systematic pattern in the errors.
On the other hand, the presence of autocorrelation causes several problems for the OLS
estimator, which are as follows:
• Loss of efficiency: the OLS estimators remain linear and unbiased, but they no longer have
minimum variance, so they are no longer the best linear unbiased estimators.
• Biased standard errors: the usual OLS formulas misstate the variances of the coefficients,
and with positive autocorrelation the standard errors are typically understated.
• Unreliable inference: because the standard errors are wrong, t tests, F tests and confidence
intervals can be misleading, and R square may overstate the fit of the model (Uyanto, 2020).
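The idea of correlation between current and past values can be made concrete with a small Python sketch. It assumes a simulated first-order autoregressive disturbance, u_t = rho * u_(t-1) + e_t, with rho = 0.7 chosen arbitrarily for illustration, and the helper autocorr is written for this example. The lag-1 sample autocorrelation of the AR(1) series should be close to rho, while that of the purely random shocks should be close to zero.

```python
import numpy as np

def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return (d[lag:] @ d[:-lag]) / (d @ d)

# Simulated (hypothetical) disturbances with persistence rho.
rng = np.random.default_rng(3)
n, rho = 2000, 0.7
shocks = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + shocks[t]   # each value carries over part of the last one

r1_ar = autocorr(u)          # near rho
r1_white = autocorr(shocks)  # near zero
print(r1_ar, r1_white)
```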
b) Drawbacks of Durbin Watson test
The Durbin Watson test is used to test for autocorrelation in the residuals of a regression
or other statistical model. The statistic ranges from 0 to 4, where a value near 2 indicates no
first-order autocorrelation in the sample, values towards 0 indicate positive autocorrelation
and values towards 4 indicate negative autocorrelation (Stanley and Doucouliagos, 2017). The
null hypothesis of the test is that the residuals are not linearly autocorrelated. The major
drawbacks of the Durbin Watson test are as follows:
First, the test must not be applied to a model that already contains autoregressive effects:
it is not an appropriate measure of autocorrelation when the explanatory variables include
lagged values of the endogenous variable (Mahaboob et al., 2019). Second, it detects only
first-order autocorrelation and is inappropriate for testing higher order serial correlation,
which can lead to misleading conclusions. Third, the test is inconclusive when the computed
value lies between the lower and upper critical bounds. Finally, because the null hypothesis is
that the residuals are not linearly autocorrelated, any autocorrelation that the test fails to
detect leaves the standard errors of the regression coefficient estimates wrong, which
undermines inference.
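For reference, the Durbin Watson statistic itself is simple to compute: it is the sum of squared differences between successive residuals divided by the sum of squared residuals, and it is approximately 2(1 - r), where r is the lag-1 autocorrelation of the residuals. The sketch below uses simulated data with an assumed AR(1) error (coefficient 0.8, chosen for illustration); the function name durbin_watson is defined here and is not taken from a library.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin Watson statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Simulated (hypothetical) regression with positively autocorrelated errors.
rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
shocks = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + shocks[t]
y = 1.0 + 2.0 * x + u

# Fit OLS and compute the statistic on the residuals.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

dw = durbin_watson(resid)         # well below 2: positive autocorrelation
dw_white = durbin_watson(shocks)  # close to 2: no autocorrelation
print(dw, dw_white)
```

Note that the sketch computes only the statistic. Deciding whether a value is significant still requires the tabulated lower and upper bounds, which is where the inconclusive region arises.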
QUESTION 5
Major difference between fixed effects and random effects models
A fixed effects model treats the group-specific parameters as fixed, non-random quantities.
A random effects model, by contrast, treats them as random variables. The fixed effects model
removes omitted variable bias from time-invariant characteristics by measuring changes within a
group over time. This is equivalent to including a dummy variable for each group, which absorbs
missing or unknown characteristics. The random effects model is used in the analysis of panel
or hierarchical data where one assumes no fixed effects, and it allows the effect of
time-invariant variables to be estimated (Mahaboob et al., 2019). Random effects estimates are
unbiased only if the omitted group-specific effects are uncorrelated with the regressors. If
removing a random effect causes a large drop in the log-likelihood, the effect differs
significantly across groups. In addition, in many applications, including econometrics and
biostatistics, a fixed effects model refers to a regression model in which the group effects
are fixed quantities.
The major difference is that the fixed effects model allows the individual-specific effect
to be correlated with the independent variables, whereas the random effects model assumes that
it is not. The random effects model further allows inference about the wider population from
which the groups are drawn, based on the assumption that the effects follow a normal
distribution (Li, Meng and Du, 2020). Another difference concerns pooling: random effects are
estimated with partial pooling, whereas fixed effects are not. Partial pooling means that a new
data point in one group also affects the estimates for the other groups, because each group's
estimate is shrunk towards the overall mean. Fixed effects are estimated as coefficients,
whereas random effects are summarised by the variance of intercepts and slopes across groups.
The fixed effects estimator works by differencing out time-invariant characteristics,
including those correlated with the observed independent variables.
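The way a fixed effects estimator removes a time-invariant effect that is correlated with a regressor can be sketched with simulated panel data. Everything below is an assumption made for illustration: 200 units observed over 5 periods, a true slope of 1.5, and a unit effect alpha that also enters x. Pooled OLS ignores alpha and is biased upwards, while the within (demeaning) transformation subtracts each unit's mean and recovers a slope close to the true value.

```python
import numpy as np

# Simulated (hypothetical) panel: the unit effect alpha is correlated with x.
rng = np.random.default_rng(5)
n_units, n_periods, beta = 200, 5, 1.5
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)
x = alpha[unit] + rng.normal(size=unit.size)
y = beta * x + alpha[unit] + rng.normal(scale=0.5, size=unit.size)

def group_demean(v, g, n_groups):
    """Subtract each group's mean from its observations (within transformation)."""
    means = np.bincount(g, weights=v, minlength=n_groups) / np.bincount(g, minlength=n_groups)
    return v - means[g]

# Pooled OLS with an intercept, ignoring the unit effect.
A = np.column_stack([np.ones(unit.size), x])
pooled_slope = np.linalg.lstsq(A, y, rcond=None)[0][1]

# Fixed effects (within) estimator: regress demeaned y on demeaned x.
xd = group_demean(x, unit, n_units)
yd = group_demean(y, unit, n_units)
fe_slope = (xd @ yd) / (xd @ xd)

print(pooled_slope, fe_slope)  # pooled is biased upwards, within is near 1.5
```

The same demeaning also wipes out any regressor that does not vary over time, which is why a fixed effects model cannot estimate the effect of time-invariant variables.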
Moreover, fixed and random effects treat group differences differently, and this affects
how the results are interpreted. For example, the effect of being a woman, a person of colour or
17 years old is treated as fixed: it is the same for everyone in that category. By contrast,
for a random effect, an example is data collected from different medical centres, where the
centre can be treated as a random effect because the centres are a sample from a larger
population (Bell, Fairbrother and Jones, 2019). Overall, both models are used with panel data,
and the choice between them depends on the purpose of the analysis and directly affects the
results. The random effects model is also known as a variance components model, in which one
assumes that there are no fixed effects in the data and estimates how much the outcome varies
across groups. Therefore, the overall discussion shows that fixed effects are constant across
individuals, whereas random effects vary.

REFERENCES
Books and Journals
Daoud, J. I., 2017, December. Multicollinearity and regression analysis. In Journal of Physics:
Conference Series (Vol. 949, No. 1, p. 012009). IOP Publishing.
Weaving, D. et al., 2019. Overcoming the problem of multicollinearity in sports performance
data: A novel application of partial least squares correlation analysis. PLoS One. 14(2).
p.e0211776.
Majid, A., Amin, M. and Akram, M. N., 2021. On the Liu estimation of bell regression model in
the presence of multicollinearity. Journal of Statistical Computation and Simulation, pp.1-21.
Roh, H. J., 2020. Spatial Transferability Testing of Dummy Variable Winter Weather Model
Using Traffic Data Collected from Five Geographically Dispersed Weigh-in-Motion Sites
in Alberta Highway Systems. Journal of Transportation Engineering, Part A:
Systems. 146(11). p.04020128.
Blackburn, M. L., 2021. Testing for dummy-variable effects in semi-logarithmic
regressions. Applied Economics Letters, pp.1-5.
Jing, J. I. N. et al., 2021. Estimation on Forest Volume Based on ALS Data and Dummy
Variable Technology. Forest Resources Management. (1). p.77.
Blazquez, C. A. et al., 2018. Spatial autocorrelation analysis of cargo trucks on highway
crashes in Chile. Accident Analysis & Prevention. 120. pp.195-210.
Miralha, L. and Kim, D., 2018. Accounting for and predicting the influence of spatial
autocorrelation in water quality modeling. ISPRS International Journal of
Geo-Information. 7(2). p.64.
Uyanto, S. S., 2020. Power comparisons of five most commonly used autocorrelation
tests. Pakistan Journal of Statistics and Operation Research, pp.119-130.
Mahaboob, B. et al., 2019, December. On misspecification tests for stochastic linear
regression model. In AIP Conference Proceedings (Vol. 2177, No. 1, p. 020039). AIP
Publishing LLC.
Mohtasib, R. S. et al., 2019. Sonographic measurements for kidney length in normal Saudi
children: correlation with other body parameters. Annals of Saudi Medicine, 39(3),
pp.143-154.
Li, T., Meng, Q. and Du, Q., 2020. Application of Random Effects to Explore the Gulf of
Mexico Coastal Forest Dynamics in Relation to Meteorological Factors. IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing. 13. pp.5526-5535.
Bell, A., Fairbrother, M. and Jones, K., 2019. Fixed and random effects models: making an
informed choice. Quality & Quantity. 53(2). pp.1051-1074.
Stanley, T. D. and Doucouliagos, H., 2017. Neither fixed nor random: weighted least squares
meta-regression. Research Synthesis Methods. 8(1). pp.19-42.