ECON 1274 Business Statistics Assessment Task 2 Project Report

Added on 2022/07/28

AI Summary

This report analyses a house price dataset for a business statistics assignment (Assessment Task 2). It reports descriptive statistics (mean, median, skewness and standard deviation), examines the distribution of house prices using a histogram, frequency polygon and box-whisker plot, identifies outliers and selects an appropriate measure of central tendency. It also computes a 95% confidence interval for the population mean house price and discusses the assumptions behind the analysis and the limitations of the data. The report is structured as a business report, with an executive summary, introduction, body, conclusion and references.

BUSINESS STATISTICS
ASSESSMENT TASK 2
STUDENT ID:

PART A
Question 1
The requisite data for 50 houses has been collected for post code 3162 as shown below.
Question 2
The requisite descriptive statistics, as obtained from Excel, are summarised below.
Question 3
The description of the various variables based on the above descriptive statistics has been
carried out below.
Price
The skew of the price data is -0.10, which implies that the shape of the distribution is
approximately symmetric. The mean price of a house is $1,517,680. However, this price is
significantly different from the median price of $1,460,000. By definition of the median, 25
of the 50 houses in the given sample were sold at a price not exceeding $1,460,000. The
difference between the mean and median price suggests that the underlying variable is not
normally distributed. The variability in price is captured by measures such as the variance,
standard deviation and range. The minimum price of a house is $405,000, while the
maximum price of a house included in the given sample is $2,300,000. Comparing the
standard deviation of the variable with its mean indicates that price variation
is moderate rather than very high (Hillier, 2016).
Size
The skew of the size data is -0.78, which implies that the shape of the distribution is
asymmetric, with a longer tail on the left side of the mean. This may indicate the presence of
outliers on the lower side. The mean size of a house is 574.45 m². However, this value
is significantly different from the median house size of 637 m². By definition of the median,
25 of the 50 houses in the given sample have a size not exceeding 637 m². The difference
between the mean and median size suggests that the underlying variable is not normally
distributed. The variability in size is captured by measures such as the variance, standard
deviation and range. The minimum size of a house is 10 m², while the maximum size of a house
included in the given sample is 1,088 m². Comparing the standard deviation of the variable with
its mean indicates that size variation is very high. Also, the significant
difference between the mean and median implies that there are a few outliers on the lower end
of the size variable (Flick, 2015).
Number of bedrooms
The skew of the bedrooms data is 1.89, which implies that the shape of the distribution is
asymmetric, with a longer tail on the right side of the mean. This may indicate the presence of
outliers on the upper side. The mean number of bedrooms in a house is 3.34. However, this
value is significantly different from the median bedroom count of 3. By definition of the
median, at least 25 of the 50 houses in the given sample have no more than 3 bedrooms. The
difference between the mean and median number of bedrooms suggests that the underlying
variable is not normally distributed. The variability in the number of bedrooms is captured by
measures such as the variance, standard deviation and range. The minimum number of bedrooms
is 2, while the maximum number of bedrooms in a house included in the given sample is 7.
Comparing the standard deviation of the variable with its mean indicates that
variation in the number of bedrooms is very high. Also, the significant difference between the
mean and median implies that there are a few outliers on the upper end of the number of
bedrooms variable (Hair et al., 2015).
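
The measures discussed in this question can also be reproduced outside Excel. The following is a minimal Python sketch, not the workbook used for the report. It assumes that the skew figures come from the adjusted sample skewness formula used by Excel's SKEW function, and that "comparing the standard deviation with the mean" refers to the coefficient of variation (standard deviation divided by mean). The price list is hypothetical illustrative data in $'000, not the sample of 50 houses, and the function names are illustrative only.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


def describe(values):
    """Return the summary measures discussed in the text for one variable."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std_dev": sd,
        "range": max(values) - min(values),
        "skew": sample_skewness(values),
        "cv": sd / mean,  # coefficient of variation: spread relative to the mean
    }


# Hypothetical prices in $'000 (NOT the report's data).
prices = [405, 980, 1250, 1460, 1500, 1720, 1900, 2300]
summary = describe(prices)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A negative skew value points to a longer left tail and a positive value to a longer right tail, which is how the skew figures for size and number of bedrooms are read above. The report does not state a cut-off for "moderate" or "very high" variation, so none is applied here.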
Question 4
The requisite histogram of house prices based on the given sample data of 50 observations is
shown below.
Frequency polygon of house prices ($’000)

Based on the above histogram and frequency polygon, the distribution of house
prices appears to be approximately normal. The modal class lies in the middle of the histogram
and the frequencies decline gradually on both sides of the mean value. The shape of the
frequency polygon is quite symmetric, except towards the extreme values on both sides. Based on
the histogram and the frequency polygon, no outlier appears to be present with regard
to house price (Medhi, 2016).
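
The class frequencies behind a histogram and the points joined by a frequency polygon can be sketched as follows. This is an illustrative Python sketch only: the class width of 400 ($'000), the starting point of 400 and the price list are assumptions made for the example, not the classes or data used in the report.

```python
def class_frequencies(values, lower, width, n_classes):
    """Count observations in equal-width classes starting at `lower`.

    Each class covers [start, start + width); a value equal to the overall
    upper boundary is placed in the last class.
    """
    counts = [0] * n_classes
    upper = lower + width * n_classes
    for x in values:
        if x == upper:
            k = n_classes - 1
        else:
            k = int((x - lower) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    # A frequency polygon joins the points (class midpoint, class frequency).
    midpoints = [lower + width * (k + 0.5) for k in range(n_classes)]
    return midpoints, counts


# Hypothetical prices in $'000 (NOT the report's data).
prices = [405, 980, 1250, 1460, 1500, 1720, 1900, 2300]
midpoints, counts = class_frequencies(prices, lower=400, width=400, n_classes=5)

# The modal class is the class with the highest frequency.
modal_midpoint = midpoints[counts.index(max(counts))]
for m, c in zip(midpoints, counts):
    print(f"class midpoint {m:.0f}: frequency {c}")
print("modal class midpoint:", modal_midpoint)
```

In this hypothetical example the modal class sits in the middle of the range, which is the pattern described above for the house price histogram.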
Question 5
The box-whisker plot for the variable house prices is as shown below.

The box plot for house price highlights a symmetric distribution, in which the extremes
appear roughly equidistant from the box on either end. The box itself also appears symmetrical,
which implies that negligible skew is present in this variable. This is also reflected in the
histogram (Hair et al., 2015).
There are two outliers on the lower end, which have been marked by dots in
the box and whisker plot. The lower limit of acceptable values is computed by subtracting 1.5
times the interquartile range (IQR) from the first quartile. Similarly, the upper limit of
acceptable values is computed by adding 1.5 times the IQR to the third quartile. Any value which
lies outside this interval is categorised as an outlier (Taylor and Cihon, 2014).
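
The 1.5 × IQR rule described above can be sketched in Python as follows. The price list is hypothetical (in $'000) and was chosen so that two low values fall outside the limits. The "inclusive" quartile convention is an assumption, since the report does not state which quartile method its plot uses, and other conventions can give slightly different limits.

```python
import statistics


def iqr_fences(values):
    """Return (lower, upper) limits: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the values lying outside the 1.5 * IQR limits."""
    low, high = iqr_fences(values)
    return [x for x in values if x < low or x > high]


# Hypothetical prices in $'000 (NOT the report's data).
prices = [405, 450, 1250, 1300, 1400, 1460, 1500, 1550, 1600, 1720, 1800, 2100]
low, high = iqr_fences(prices)
print(f"acceptable interval: {low} to {high}")
print("outliers:", find_outliers(prices))
```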
Question 6
Since outliers are present in the house price data, the suitable measure of central
tendency for house price is the median and not the mean. The mean would have been
a suitable measure of central tendency if the underlying variable were symmetrically
distributed and free of outliers. The median is preferred over the mean because the mean tends
to be distorted by extreme values. This is not the case with the median, whose computation is
independent of the extreme values. As a result, when outliers are present, the median is taken
as the appropriate measure of central tendency (Hillier, 2016).
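
The sensitivity of the mean to extreme values can be illustrated with a small Python sketch. The figures are hypothetical prices in $'000 and are not taken from the sample; one value is replaced by a low outlier to show how each measure responds.

```python
import statistics

# Hypothetical prices in $'000 (NOT the report's data).
typical = [1400, 1450, 1460, 1500, 1550]
# Same list with the lowest value replaced by an extreme low price.
with_outlier = [405, 1450, 1460, 1500, 1550]

mean_shift = statistics.fmean(typical) - statistics.fmean(with_outlier)
median_shift = statistics.median(typical) - statistics.median(with_outlier)

print("mean:  ", statistics.fmean(typical), "->", statistics.fmean(with_outlier))
print("median:", statistics.median(typical), "->", statistics.median(with_outlier))
```

In this example the mean falls from 1472 to 1273 while the median stays at 1460, which is the behaviour that motivates the choice of the median above.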
Question 7
The 95% confidence interval for the population mean house price in the given postcode has been
computed based on the collected sample data of 50 houses in this area.
It can be concluded from the above computation that, with 95% confidence, the population mean
house price lies between $1,397.91 thousand and $1,637.45 thousand (that is, between $1,397,910
and $1,637,450). It is noteworthy that there is a 5% chance that an interval constructed in
this way does not contain the population mean house price
(Eriksson and Kovalainen, 2015).
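
The interval above has the usual form for a t-based confidence interval: sample mean ± t × s / √n. The following Python sketch, which is not the report's own computation, checks the reported bounds against that form. The critical value 2.0096 (two-sided 95%, 49 degrees of freedom) is taken from a standard t table and is an assumption about the value used. The implied standard error is back-calculated from the reported bounds and is not a figure stated in the report.

```python
import math

# Two-sided 95% critical value of t with n - 1 = 49 degrees of freedom,
# taken from a standard t table (assumed, not stated in the report).
T_CRIT_95_DF49 = 2.0096


def t_interval(mean, sd, n, t_crit):
    """Confidence interval for a mean: mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


def midpoint_and_margin(lower, upper):
    """Recover the centre and half-width of a reported interval."""
    return (lower + upper) / 2, (upper - lower) / 2


# Reported bounds, in $'000.
centre, margin = midpoint_and_margin(1397.91, 1637.45)

# Back-calculated values (implied by the bounds, not reported directly).
implied_se = margin / T_CRIT_95_DF49
implied_sd = implied_se * math.sqrt(50)

print(f"centre of interval: {centre:.2f}")
print(f"margin of error:    {margin:.2f}")
print(f"implied std. error: {implied_se:.2f}")
```

The centre of the interval, 1,517.68, agrees with the sample mean of $1,517,680 reported in Question 3, which is what the formula requires.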
A number of assumptions have been made in order to reach the above
estimate. Some of these are indicated below (Medhi, 2016).
• It has been assumed that the sample data was collected using random sampling.
In reality this was not the case, since non-random sampling was used. As a result, it is quite
possible that the house data is not representative of the actual population.
• It has been assumed that house prices are normally distributed. This is also not
strictly true, as there are two outliers present in the data, as is apparent from the
stem-and-leaf plot.
• It is assumed that the population standard deviation of house prices is
not known, and hence the t statistic has been used for the computation instead of the z statistic.

References
Eriksson, P. and Kovalainen, A. (2015) Quantitative methods in business research. 3rd ed.
London: Sage Publications.
Flick, U. (2015) Introducing research methodology: A beginner's guide to doing a research
project. 4th ed. New York: Sage Publications.
Hair, J. F., Wolfinbarger, M., Money, A. H., Samouel, P., and Page, M. J. (2015) Essentials
of business research methods. 2nd ed. New York: Routledge.
Hillier, F. (2016) Introduction to Operations Research. 6th ed. New York: McGraw Hill
Publications.
Medhi, J. (2016) Statistical Methods: An Introductory Text. 4th ed. Sydney: New Age
International.
Taylor, K. J. and Cihon, C. (2014) Statistical Techniques for Data Analysis. 2nd ed.
Melbourne: CRC Press.