Data Analysis: Exploring Relationships, Regression & Time Series

Added on  2023/03/30

Homework Assignment
AI Summary
This assignment provides a detailed analysis of relationships between variables, regression analysis for house price prediction, and time series forecasting of Melbourne house prices. It uses contingency tables to explore the relationship between house prices and suburbs, scatter plots and correlation matrices to analyze variable relationships, and regression models to predict house prices based on house area, street appeal, weekly rent, and mountain views. The assignment also includes a time series analysis of Melbourne house prices, using a linear regression model to forecast future prices. The analysis identifies key variables influencing house prices and provides insights into the trends and patterns in the Melbourne housing market.
DATA ANALYSIS
STUDENT ID:
Question 1 – Relationships
(a) To explore whether there is a relationship between House Price and Suburb, the
following contingency tables are useful.
From the above contingency table, house price appears to vary with suburb. In the lowest
price bracket, suburb 1 has the largest share of properties at 43.5% while suburb 3 has the
smallest at 17%. In the highest and second-highest price brackets, the largest
representation comes from suburb 3 properties and the smallest from suburb 1 properties.
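The percentages quoted above are shares of each suburb within a price bracket. A minimal sketch of that calculation follows; the records, bracket labels and the `bracket_shares` helper are hypothetical illustrations, not the assignment's data or method.

```python
from collections import Counter

# Hypothetical (suburb, price_bracket) records; NOT the assignment's data.
records = [
    (1, "low"), (1, "low"), (1, "low"), (1, "mid"),
    (2, "low"), (2, "mid"), (2, "mid"), (2, "high"),
    (3, "low"), (3, "mid"), (3, "high"), (3, "high"),
]


def bracket_shares(rows):
    """Return {bracket: {suburb: percent of that bracket's properties}}."""
    cell_counts = Counter(rows)                         # count per (suburb, bracket)
    bracket_totals = Counter(b for _, b in rows)        # count per bracket
    shares = {}
    for (suburb, bracket), n in cell_counts.items():
        shares.setdefault(bracket, {})[suburb] = 100.0 * n / bracket_totals[bracket]
    return shares


shares = bracket_shares(records)
for bracket in ("low", "mid", "high"):
    row = {s: round(p, 1) for s, p in sorted(shares[bracket].items())}
    print(bracket, row)
```

If the shares differ noticeably across suburbs within a bracket, as they do in the table discussed above, that points to an association between suburb and price.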
(b) The relevant scatter plot for the various combinations of variables is indicated as follows.
The above scatter plot indicates that there is neither a linear nor a non-linear relationship
between the given variables, i.e. area of the block of land and house price.
The relationship between house area and house price seems to be linear, considering that the
scatter points do not deviate very far from the line of best fit.
The relationship between weekly rent and house price seems to be linear, considering that the
scatter points do not deviate very far from the line of best fit.
(c) The requisite correlation matrix between the given variables is shown below.
From the above correlation matrix, it is evident that the independent variable with the
strongest relationship with house price is weekly rent.
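Each entry of a correlation matrix is a Pearson correlation coefficient, and the "strongest relationship" is the variable whose coefficient with house price has the largest absolute value. The sketch below illustrates this with made-up numbers; the variable names and values are assumptions, not the assignment's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical observations; NOT the assignment's data.
price = [420, 510, 480, 600, 650]
candidates = {
    "weekly_rent": [310, 380, 350, 450, 470],
    "land_area": [700, 520, 810, 640, 600],
}

# Correlation of each candidate predictor with house price.
correlations = {name: pearson_r(vals, price) for name, vals in candidates.items()}

# The strongest linear relationship has the largest |r|, regardless of sign.
strongest = max(correlations, key=lambda name: abs(correlations[name]))
print(correlations, strongest)
```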
Question 2 – Regression Analysis
(a) The output for the regression model obtained from Excel is indicated as follows.
The regression equation is as follows.
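House Price = 377.37 + 1.92 * House Area (sqm)
House area has been given as 500 sqm.
Hence, house price = 377.37 + 1.92*500 ≈ $1,339.13 (presumably computed with unrounded
coefficients; the rounded coefficients shown give $1,337.37).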
There are concerns with regard to the above prediction. These arise primarily on the following
two counts.
(i) The range of house area values used to estimate the regression model does not include
500 sqm, so the prediction is an extrapolation and may be unreliable.
(ii) The R² of the model is only 0.31, which implies that house area explains only 31% of the
variation in house price. The predictive power of the model is therefore low, and the
prediction may deviate substantially from the actual value.
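The prediction and the extrapolation concern can be sketched together as follows. The coefficients are the rounded ones reported in the regression equation; the observed range of house areas is not given in the text, so the range passed in below is a purely hypothetical placeholder, as is the `predict_price` helper.

```python
def predict_price(area_sqm, intercept=377.37, slope=1.92, observed_range=None):
    """Predict house price from house area with a simple linear model.

    Returns (price, extrapolated). `extrapolated` is True when `area_sqm`
    falls outside `observed_range`, the (min, max) of the areas used to fit
    the model.
    """
    price = intercept + slope * area_sqm
    extrapolated = observed_range is not None and not (
        observed_range[0] <= area_sqm <= observed_range[1]
    )
    return price, extrapolated


# (100, 400) is a HYPOTHETICAL fitting range, used only for illustration.
price, extrapolated = predict_price(500, observed_range=(100, 400))
print(f"predicted price: {price:.2f}, extrapolated: {extrapolated}")
```

Because the rounded slope is used, the result can differ slightly from a figure computed with the full-precision Excel coefficients.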
(b) The correlation matrix is indicated below.
The multiple regression model with street appeal, weekly rent and mountain view as the
independent variables, estimated in Excel to predict house price, is shown below.
The model has R² = 0.7728, which implies that 77.28% of the variation in house price is
explained jointly by the independent variables in the regression model. This suggests that
the model is a reasonably good fit, as less than a quarter of the variation remains
unexplained. Further, the slope coefficients of all three independent variables are
statistically significant, as is apparent from the p-value of 0.000 for each. It can
therefore be concluded that the regression model does explain house prices.
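For reference, R² is the share of the total variation in the observed values that the fitted values account for, i.e. 1 − SSE/SST. The sketch below shows that calculation on made-up observed and fitted prices; the numbers and the `r_squared` helper are illustrative assumptions, not the assignment's Excel output.

```python
def r_squared(actual, fitted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))   # unexplained variation
    sst = sum((a - mean_actual) ** 2 for a in actual)         # total variation
    return 1 - sse / sst


# Hypothetical observed and model-fitted house prices; NOT the assignment's data.
actual = [420, 510, 480, 600, 650]
fitted = [440, 500, 470, 590, 660]

r2 = r_squared(actual, fitted)
print(f"R^2 = {r2:.4f}; unexplained share = {1 - r2:.4f}")
```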
Question 3 – Time Series
(a) The requisite time series for Melbourne house prices is shown below.
It is evident from the above scatter plot that the median prices in Melbourne broadly follow a
linear trend, with prices increasing steadily. Although there are positive and negative
deviations from the line of best fit, a linear regression model is appropriate for
estimation.
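A linear trend of this kind is fitted by ordinary least squares with the quarter index as the only predictor. The sketch below shows the closed-form fit on a short, made-up price series; the data and the `fit_line` helper are assumptions for illustration and do not reproduce the Excel output.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical quarterly median prices; NOT the Melbourne series.
quarters = [1, 2, 3, 4, 5, 6]
median_price = [245, 250, 255, 262, 270, 274]

intercept, slope = fit_line(quarters, median_price)
print(f"price = {intercept:.3f} + {slope:.3f} * quarter")
```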
(b) The linear regression model as obtained from Excel is indicated below.
The requisite relationship between median price and quarter is summarized below.
Price = 237.405 + 6.328*Quarter
Price (March 2017) = 237.405 + 6.328*61 = $623.4
Price (June 2017) = 237.405 + 6.328*62 = $629.7
Price (September 2017) = 237.405 + 6.328*63 = $636.1
Price (December 2017) = 237.405 + 6.328*64 = $642.4
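The four forecasts can be reproduced by plugging the quarter indices into the fitted equation. The sketch below uses the rounded coefficients reported above, so a result may differ in the last digit from a figure computed with unrounded Excel coefficients; the `forecast` helper is a hypothetical name.

```python
INTERCEPT = 237.405  # rounded intercept from the reported equation
SLOPE = 6.328        # rounded slope per quarter from the reported equation


def forecast(quarter):
    """Forecast median price for a given quarter index using the linear trend."""
    return INTERCEPT + SLOPE * quarter


# Quarter indices for 2017 as used in the calculations above.
quarters_2017 = [
    ("March 2017", 61),
    ("June 2017", 62),
    ("September 2017", 63),
    ("December 2017", 64),
]

forecasts = {label: round(forecast(q), 1) for label, q in quarters_2017}
for label, value in forecasts.items():
    print(f"Price ({label}) = {value}")
```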