Regression Analysis for House Market Value Estimation
Added on 2023/05/30
AI Summary
This report presents a regression analysis for estimating the market value of a house from four independent variables. It covers scatter plots, a multiple regression model, coefficient interpretation, significance testing, the coefficient of determination, confidence intervals, and a comparison of models.
1) Introduction
The aim of this report is to build an appropriate regression model for the variables provided. The dataset contains five variables, all of them quantitative, which permits regression analysis. Data are available for each of the past 15 years, so the sample size is 15. The primary objective is to develop a multiple regression model in which market price is the dependent variable and the remaining four variables are the independent variables. All variables are measured on a ratio or interval scale, as a multiple regression model requires. The variables provided appear suitable for estimating the market value of a house. Once the multiple regression model has been developed, it is revised to weed out independent variables that do not have a significant relationship with the dependent variable.
2) Scatter Plot
This section presents a scatter plot of the dependent variable against each independent variable.
The scatter plot between the independent variable (Sydney Price Index) and the dependent variable (market price) is illustrated below. The best-fit line in the plot has a positive slope, so the underlying linear relationship between the two variables is positive.
The scatter points also deviate only minimally from the line of best fit, which indicates that the correlation between the variables is high. It is therefore fair to conclude that the two variables (Sydney Price Index and market price) have a strong positive relationship (Flick, 2015).
The scatter plot between the independent variable (annual % change) and the dependent variable (market price) is illustrated below. The best-fit line in the plot has a positive slope, so the underlying linear relationship between the two variables is positive. The scatter points deviate considerably from the line of best fit, which indicates that the correlation between the variables is low to moderate. It is therefore fair to conclude that the two variables (annual % change and market price) have a positive but weak to moderate relationship (Eriksson and Kovalainen, 2015).
The scatter plot between the independent variable (age of house) and the dependent variable (market price) is illustrated below.
The best-fit line in the plot has a negative slope, so the underlying linear relationship between the two variables is negative. The scatter points do not deviate greatly from the line of best fit, which indicates that the correlation between the variables is moderately high. It is therefore fair to conclude that the two variables (age of house and market price) have a moderately strong negative relationship (Hair et al., 2015).
The scatter plot between the independent variable (area of house) and the dependent variable (market price) is illustrated below.
The best-fit line in the plot has a positive slope, so the underlying linear relationship between the two variables is positive. The scatter points deviate moderately from the line of best fit, which indicates that the correlation between the variables is only moderate. It is therefore fair to conclude that the two variables (area of house and market price) have a positive but moderate relationship (Hillier, 2016).
3) Multiple Regression Model
The multiple regression model has been estimated in Excel, and the relevant output is illustrated below.
4) Equation & Coefficients
The regression equation derived from the Excel output is given below. In this equation the intercept is 548.98, the coefficients on the independent variables are the slope coefficients, and the standard error of the model is 43.8878.
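Written out, the fitted model takes the form shown here. This is a reconstruction from the intercept stated above and the rounded slope coefficients reported in Section 5, not a copy of the original Excel output. The reading of market price in thousands of dollars is an assumption, inferred from the dollar amounts quoted in Section 5.

```latex
\widehat{\text{Market price}}
  = 548.98
  + 1.96\,(\text{Sydney Price Index})
  - 5.62\,(\text{Annual \% change})
  + 0.52\,(\text{House area})
  - 2.49\,(\text{House age})
```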
5) Coefficient interpretation and significance testing
The coefficients of the multiple regression model can be interpreted as follows.
Intercept – This coefficient is the house market value when all the independent variables take a value of zero, which is of course not a practical scenario.
Slope coefficient (Sydney Price Index) – This variable has a slope coefficient of 1.96. When the index increases by 1 unit, with the other variables held constant, the house market price increases by $1,960. Because the coefficient is positive, the two variables move in the same direction (Fehr and Grossman, 2013).
Slope coefficient (Annual % change) – This variable has a slope coefficient of -5.62. When the annual % change increases by 1 unit, with the other variables held constant, the house market price decreases by $5,620. Because the coefficient is negative, the two variables move in opposite directions (Hastie, Tibshirani and Friedman, 2014).
Slope coefficient (House Area) – This variable has a slope coefficient of 0.52. When the house area increases by 1 unit, with the other variables held constant, the house market price increases by $520. Because the coefficient is positive, the two variables move in the same direction.
Slope coefficient (House Age) – This variable has a slope coefficient of -2.49. When the house age increases by 1 unit, with the other variables held constant, the house market price decreases by $2,490. Because the coefficient is negative, the two variables move in opposite directions (Fehr and Grossman, 2013).
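To make the interpretation concrete, the sketch below evaluates the fitted equation using the intercept and the rounded slope coefficients reported above. The names `predict_price` and `marginal_effect_dollars`, the sample inputs, and the assumption that price is measured in thousands of dollars (implied by reading 1.96 as $1,960) are illustrative and do not come from the original Excel output.

```python
# Intercept and slopes as reported in the text (rounded values).
# Assumption: market price is expressed in thousands of dollars.
INTERCEPT = 548.98
SLOPES = {
    "sydney_price_index": 1.96,
    "annual_pct_change": -5.62,
    "house_area": 0.52,
    "house_age": -2.49,
}


def predict_price(sydney_price_index, annual_pct_change, house_area, house_age):
    """Return the fitted market price (assumed $ thousands) for one house."""
    return (
        INTERCEPT
        + SLOPES["sydney_price_index"] * sydney_price_index
        + SLOPES["annual_pct_change"] * annual_pct_change
        + SLOPES["house_area"] * house_area
        + SLOPES["house_age"] * house_age
    )


def marginal_effect_dollars(variable):
    """Dollar change in fitted price for a one-unit increase in `variable`,
    holding the other variables constant."""
    return SLOPES[variable] * 1000


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show how the function is called.
    base = predict_price(100, 5, 200, 10)
    older = predict_price(100, 5, 200, 11)
    print(f"Fitted price: {base:.2f} (thousand $)")
    print(f"Effect of one extra year of age: {older - base:.2f} (thousand $)")
```

Raising one input by a single unit while leaving the others unchanged shifts the fitted price by exactly that variable's slope, which is the "holding the other variables constant" reading used in the interpretations.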
The statistical significance of each slope coefficient is tested below.
Sydney Price Index
H0: β_SydneyPriceIndex = 0, i.e. the slope coefficient of the variable is not significant and can be taken as zero.
H1: β_SydneyPriceIndex ≠ 0, i.e. the slope coefficient of the variable is significant and cannot be taken as zero.
The test uses a 5% significance level and is based on the t statistic. From the multiple regression output, the t statistic is 3.37 and the corresponding p-value is 0.01. The p-value (0.01) is lower than the significance level (0.05), so H0 is rejected on the available evidence and H1 is accepted (Flick, 2015). The implication is that the slope coefficient of this independent variable is significant.
Annual % Change
H0: β_Annual%Change = 0, i.e. the slope coefficient of the variable is not significant and can be taken as zero.
H1: β_Annual%Change ≠ 0, i.e. the slope coefficient of the variable is significant and cannot be taken as zero.
The test uses a 5% significance level and is based on the t statistic. From the multiple regression output, the t statistic is -1.74 and the corresponding p-value is 0.11. The p-value (0.11) exceeds the significance level (0.05), so H0 cannot be rejected on the available evidence and H1 is not accepted (Medhi, 2016). The implication is that the slope coefficient of this independent variable is not significant.
Total Area
H0: β_TotalArea = 0, i.e. the slope coefficient of the variable is not significant and can be taken as zero.
H1: β_TotalArea ≠ 0, i.e. the slope coefficient of the variable is significant and cannot be taken as zero.
The significance level for this test is 5%.
The test is based on the t statistic. From the multiple regression output, the t statistic is 1.64 and the corresponding p-value is 0.14. The p-value (0.14) exceeds the significance level (0.05), so H0 cannot be rejected on the available evidence and H1 is not accepted (Hillier, 2016). The implication is that the slope coefficient of this independent variable is not significant.
Age of House
H0: β_AgeOfHouse = 0, i.e. the slope coefficient of the variable is not significant and can be taken as zero.
H1: β_AgeOfHouse ≠ 0, i.e. the slope coefficient of the variable is significant and cannot be taken as zero.
The test uses a 5% significance level and is based on the t statistic. From the multiple regression output, the t statistic is -2.20 and the corresponding p-value is 0.052. The p-value (0.052) marginally exceeds the significance level (0.05), so H0 cannot be rejected on the available evidence and H1 is not accepted (Hastie, Tibshirani and Friedman, 2014). The implication is that the slope coefficient of this independent variable is not significant at the 5% level.
6) Coefficient of Determination
For the multiple regression model, the R² value is 0.7906. This means that the independent variables jointly account for 79.06% of the variation in the dependent variable, i.e. house market price. About 21% of the variation in the dependent variable is therefore not accounted for by the regression model. On this basis, it is fair to conclude that the regression model is a good fit (Medhi, 2016).
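The significance decisions in Section 5 and the fit measure in Section 6 can be checked with a few lines of arithmetic. The sketch below applies the p-value rule to the four reported p-values and derives two quantities that the text does not report: the adjusted R² and the overall F statistic. Both follow from the standard formulas with n = 15 observations and k = 4 predictors. The function names are illustrative, and the derived figures inherit the rounding of the reported R².

```python
ALPHA = 0.05  # significance level used throughout Section 5

# p-values as reported in the text for each slope coefficient.
P_VALUES = {
    "Sydney Price Index": 0.01,
    "Annual % change": 0.11,
    "Total area": 0.14,
    "Age of house": 0.052,
}


def is_significant(p_value, alpha=ALPHA):
    """Reject H0 (slope equals zero) only when the p-value is below alpha."""
    return p_value < alpha


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def overall_f_statistic(r2, n, k):
    """F statistic for the joint test that all k slopes are zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


significant = [name for name, p in P_VALUES.items() if is_significant(p)]
adj_r2 = adjusted_r_squared(0.7906, n=15, k=4)
f_stat = overall_f_statistic(0.7906, n=15, k=4)

if __name__ == "__main__":
    print("Significant at 5%:", significant)
    print(f"Adjusted R^2: {adj_r2:.4f}")
    print(f"Overall F statistic: {f_stat:.2f}")
```

With the reported figures, only the Sydney Price Index passes the 5% test, the adjusted R² comes out at roughly 0.707, and the F statistic at roughly 9.44 on 4 and 10 degrees of freedom.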
7) Confidence interval
Based on the Excel multiple regression output, the 95% confidence intervals for the respective parameters are as follows.
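An interval of this kind can be sketched from a coefficient and its t statistic. The example uses the age coefficient (-2.49) and its t statistic (-2.20) from Section 5. The standard error is backed out as their ratio, and the critical value 2.228 is the standard two-sided 5% t value for 15 - 4 - 1 = 10 degrees of freedom. Because the inputs are rounded, the result is approximate. The function name is illustrative.

```python
# Two-sided 95% critical value of the t distribution with 10 degrees of
# freedom (n - k - 1 = 15 - 4 - 1), taken from a standard t table.
T_CRIT_DF10_95 = 2.228


def confidence_interval(coefficient, t_statistic, t_critical=T_CRIT_DF10_95):
    """Approximate confidence interval for a slope coefficient.

    The standard error is recovered as |coefficient / t_statistic|, which
    holds because the reported t statistic tests the slope against zero.
    """
    standard_error = abs(coefficient / t_statistic)
    margin = t_critical * standard_error
    return coefficient - margin, coefficient + margin


# Age of house: coefficient and t statistic as reported in Section 5.
lower, upper = confidence_interval(-2.49, -2.20)

if __name__ == "__main__":
    print(f"95% CI for the age coefficient: ({lower:.2f}, {upper:.2f})")
```

With these rounded inputs the interval runs from about -5.01 to 0.03. It contains zero, which is consistent with the age coefficient not being significant at the 5% level.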
The confidence intervals above indicate that, with 95% confidence, the population slope coefficient of each variable lies within the computed bounds. For example, the interval for age indicates 95% confidence that the population slope coefficient of age lies between -5.01 and 0.03 (Flick, 2015).
8) Revised regression model
The revised regression model has house area as the only independent variable and house price as the dependent variable. The relevant output is illustrated as follows. Based on this output, the estimated regression line equation is as given below.
9) Models Comparison
For the multiple regression model, the coefficient of determination is 0.7906, whereas it is only 0.0981 for the revised simple regression model. It is therefore fair to conclude that the revised simple regression model is not a good fit, since it accounts for only 9.81% of the variation seen in house market prices (Hillier, 2016). Moreover, for the revised model, the t statistic of the slope coefficient and the corresponding p-value do not establish the significance of the slope coefficient (Eriksson and Kovalainen, 2015). The original multiple regression model is therefore superior in terms of predictive capacity, fit, and the significance of the model and of at least one slope coefficient.
10) Market Price Estimation
Since only the area of the building has been provided and no other information is given, the price must be estimated from the revised simple regression model, whose equation is as follows.
References
Eriksson, P. and Kovalainen, A. (2015) Quantitative methods in business research. 3rd ed. London: Sage Publications.
Fehr, F. H. and Grossman, G. (2013) An introduction to sets, probability and hypothesis testing. 3rd ed. Ohio: Heath.
Flick, U. (2015) Introducing research methodology: A beginner's guide to doing a research project. 4th ed. New York: Sage Publications.
Hair, J. F., Wolfinbarger, M., Money, A. H., Samouel, P. and Page, M. J. (2015) Essentials of business research methods. 2nd ed. New York: Routledge.
Hastie, T., Tibshirani, R. and Friedman, J. (2014) The Elements of Statistical Learning. 4th ed. New York: Springer Publications.
Hillier, F. (2016) Introduction to Operations Research. 6th ed. New York: McGraw Hill Publications.
Medhi, J. (2016) Statistical Methods: An Introductory Text. 4th ed. Sydney: New Age International.