MITS6002 Assignment 3: Retail Insights and Regression Analysis

Added on 2022/08/20

AI Summary

This assignment, for the MITS6002 Business Analytics course, addresses key concepts in data analysis and business insights. The first part of the assignment involves a critical review of the 'CommBank Retail Business Insights Report FY18,' evaluating the quality of visualizations, presentability, and the information provided, followed by a summary of key findings and suggestions for improvement. The report highlights innovation trends among Australian retailers. The second part delves into regression analysis, providing an example of its application, collecting and analyzing height and weight data to compute the regression equation, calculate the R-squared value, and interpret the goodness of fit. The third part differentiates between classification and prediction methods, explores neural networks, and discusses the application of clustering in business analytics, with examples such as customer segmentation in grocery stores, banking, and car rental businesses. References are included to support the analysis.

MITS6002 Business Analytics
Assignment 3

Question 1
i. Comment on the insights report based on the overall features; including the
quality of visualizations, presentability, and the information provided
a. Quality of Visualizations and presentability
Overall, the report's visualizations are relatively clear. They are accompanied by explanatory
text and use high-contrast colors that keep them legible, so their quality is fairly good and they
are easy to present, e.g. when projected.
b. Information provided
The report includes a wide range of information on different aspects of innovation in Australia
all of which can be used to tell a story. Hence, the report is fairly informative.
ii. Key Information derived from the report
The key information in the report concerns the level and distribution of innovation among
Australian retailers. This information can inform decisions on how much investment to commit to
a firm's initiatives, depending on how innovative the firm ranks itself.
iii. Summary of the report’s insights
In Australia, only 71% of multichannel retailers are active, while 87% of retailers are either
innovation active or improvers. The percentage of adopters increased from 26.2% in 2016 to 32%
in 2018, which indicates an upward trend among retailers. On the innovation index, retailers
from VIC/TAS are the most innovative (32.0), while those from WA are the least innovative
(24.9). In addition, 48% of retailers invest in sales and marketing, while 55% invest in their
websites. Moreover, 80% of innovative retailers expect returns on innovation (ROI) within
12 months.

iv. Suggested improvements to this insights report
Since the report covers several years, it could use trend curves to show how different aspects of
innovation have changed over time. Further, the report uses almost the same colors for all its
visualizations. A wider, though not excessive, range of colors would make the report more
attractive and keep it engaging for the reader.
Question 2
i. Example of where regression analysis can be used effectively
Regression analysis can be applied in various fields, including healthcare, banking, mining,
engineering, and business. In business, for example, it can be used to analyze the effectiveness
of a marketing strategy, the effect of a product's pricing, or the effect of promotions on sales.
To measure the effectiveness of a given marketing strategy on sales, a firm can conduct a
regression analysis of the relationship between the investment made in the strategy and the
return on that investment.
ii. Data containing information on height and weight of 10 individuals
Table 1: Height and Weight data
No.   Height (Inches)   Weight (Pounds)
1     66.72             112.99
2     67.78             143.31
3     71.09             142.42
4     65.43             121.23
5     69.11             115.70
6     67.71             121.96
7     70.04             131.52
8     69.22             128.10
9     70.73             131.64
10    73.90             140.61
iii. Scatterplot based on the above data. Based on the plot comment on the
relationship between height and weight.
[Scatterplot: Weight (Pounds) on the vertical axis (0 to 160) against Height (Inches) on the horizontal axis (65 to 75)]
Figure 1: Scatterplot illustrating the relationship between weights of the individuals and their heights
Based on the scatterplot above, weight tends to increase with height, i.e. there is a positive
relationship between the two variables.
iv. The equation of the regression line.
The equation of the regression line on the relationship between weight and height is obtained as
shown below:
Table 2: Statistics for use in regression and correlation analysis
        Height    Weight    Height×Weight   Height²   Weight²
        66.72     112.99    7538.7          4451.6    12767
        67.78     143.31    9713.6          4594.1    20538
        71.09     142.42    10125           5053.8    20283
        65.43     121.23    7932.1          4281.1    14697
        69.11     115.7     7996            4776.2    13386
        67.71     121.96    8257.9          4584.6    14874
        70.04     131.52    9211.7          4905.6    17298
        69.22     128.1     8867.1          4791.4    16410
        70.73     131.64    9310.9          5002.7    17329
        73.9      140.61    10391           5461.2    19771
Sum     691.73    1289.5    89344           47902     167353
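The columns of Table 2 can be recomputed directly from the Table 1 data. The following is a minimal sketch in Python, not the author's own working; the variable names (`heights`, `weights`, `rows`, `sums`) are illustrative assumptions.

```python
# Height (inches) and weight (pounds) for the 10 individuals in Table 1.
heights = [66.72, 67.78, 71.09, 65.43, 69.11, 67.71, 70.04, 69.22, 70.73, 73.90]
weights = [112.99, 143.31, 142.42, 121.23, 115.70, 121.96, 131.52, 128.10, 131.64, 140.61]

# One row per individual: X, Y, X*Y, X^2, Y^2 (the columns of Table 2).
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(heights, weights)]

# Column totals, in the same order as the row entries.
sums = [sum(column) for column in zip(*rows)]

for x, y, xy, xx, yy in rows:
    print(f"{x:8.2f} {y:8.2f} {xy:10.1f} {xx:8.1f} {yy:8.0f}")
print("Sum", " ".join(f"{s:.4f}" for s in sums))
```

The five totals are the only inputs needed for the slope, intercept and R-squared calculations that follow.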
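Now:
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{691.73}{10} = 69.173 \]
\[ \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1289.48}{10} = 128.948 \]
\[ SS_{xx} = \sum_{i=1}^{n} X_i X_i - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2 \]
\[ SS_{yy} = \sum_{i=1}^{n} Y_i Y_i - \frac{1}{n}\left(\sum_{i=1}^{n} Y_i\right)^2 \]
\[ SS_{xy} = \sum_{i=1}^{n} X_i Y_i - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right) \]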
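Therefore, based on our calculations above, the regression coefficients (the slope m and the
y-intercept c) are obtained as follows:
\[ m = \frac{SS_{xy}}{SS_{xx}} = \frac{146.41886}{53.30961} = 2.7466 \]
\[ c = \bar{Y} - m\,\bar{X} = 128.948 - 2.7466 \times 69.173 \approx -61.0409 \]
The intercept is evaluated with the unrounded slope (2.74658), which is why it differs slightly
from the product of the rounded figures.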
Therefore, we find that our regression equation for the relationship between height and weight is:
Weight=−61.0409 + 2.7466 (Height) -------------------------------------- (Equation 1)
From Equation 1, an increase in an individual's height from 60 inches to 61 inches is associated
with an increase in predicted weight from approximately 103.76 to 106.50 pounds, i.e. by the
slope of about 2.75 pounds per inch.
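The least-squares coefficients and the two predictions can be checked with a few lines of code. This is a minimal sketch in Python under the assumption that the Table 1 data are used as given; names such as `ss_xx`, `ss_xy` and `predict_weight` are illustrative and do not come from the assignment.

```python
heights = [66.72, 67.78, 71.09, 65.43, 69.11, 67.71, 70.04, 69.22, 70.73, 73.90]
weights = [112.99, 143.31, 142.42, 121.23, 115.70, 121.96, 131.52, 128.10, 131.64, 140.61]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Corrected sums of squares and cross-products.
ss_xx = sum(x * x for x in heights) - sum(heights) ** 2 / n
ss_xy = sum(x * y for x, y in zip(heights, weights)) - sum(heights) * sum(weights) / n

slope = ss_xy / ss_xx                 # change in weight per inch of height
intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)


def predict_weight(height):
    """Predicted weight (pounds) for a given height (inches)."""
    return intercept + slope * height


print(f"Weight = {intercept:.4f} + {slope:.4f} (Height)")
print(f"60 in: {predict_weight(60):.2f} lb, 61 in: {predict_weight(61):.2f} lb")
```

Note that 60 and 61 inches lie below the smallest observed height (65.43 inches), so these two predictions are extrapolations from the fitted line.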
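v. The R² value of the regression equation and comment on the goodness of the fit.
The R-Squared value is given by:
\[ \sum X = 691.73, \quad \sum Y = 1289.48, \quad \sum XY = 89343.6189, \quad \sum X^2 = 47902.3489, \quad \sum Y^2 = 167352.7792 \]
\[ R^2 = \left( \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[ n\sum X^2 - \left(\sum X\right)^2 \right]\left[ n\sum Y^2 - \left(\sum Y\right)^2 \right]}} \right)^2 \]
\[ R = \frac{10 \times 89343.6189 - 691.73 \times 1289.48}{\sqrt{\left[ 10 \times 47902.3489 - 691.73^2 \right]\left[ 10 \times 167352.7792 - 1289.48^2 \right]}} = 0.6111 \]
\[ R^2 = 0.6111^2 \approx 0.3734 \]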
The R-Squared value of the regression model is approximately 0.3734 as obtained above. This
figure implies that the regression model accounts for only 37.34% of the variability in weight,
which is relatively low, leading to the conclusion that the model is not a good fit for the data.
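The correlation coefficient and R-squared can be reproduced from the raw data. The sketch below, in Python, evaluates the same formula and, as an extra check that is not part of the original working, also computes R-squared as 1 minus the ratio of residual to total sum of squares. All variable names are illustrative assumptions.

```python
from math import sqrt

heights = [66.72, 67.78, 71.09, 65.43, 69.11, 67.71, 70.04, 69.22, 70.73, 73.90]
weights = [112.99, 143.31, 142.42, 121.23, 115.70, 121.96, 131.52, 128.10, 131.64, 140.61]

n = len(heights)
sum_x, sum_y = sum(heights), sum(weights)
sum_xy = sum(x * y for x, y in zip(heights, weights))
sum_xx = sum(x * x for x in heights)
sum_yy = sum(y * y for y in weights)

# Pearson correlation coefficient from the raw sums.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
)
r_squared = r ** 2

# Cross-check: R^2 = 1 - SSE / SST for the fitted least-squares line.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = sum_y / n - slope * sum_x / n
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(heights, weights))
sst = sum((y - sum_y / n) ** 2 for y in weights)
r_squared_check = 1 - sse / sst

print(f"R = {r:.4f}, R^2 = {r_squared:.4f}, 1 - SSE/SST = {r_squared_check:.4f}")
```

For simple linear regression with an intercept the two routes to R-squared agree, which is a quick way to catch arithmetic slips in the hand calculation.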
vi. Use an analytics tool of your choice to calculate the values for iv, and v.
Compare them with your answer.
Using Excel we obtained the results given in figure 2 below:
[Scatterplot with fitted trendline: Weight (Pounds) on the vertical axis (0 to 160) against Height (Inches) on the horizontal axis (65 to 75); trendline f(x) = 2.74657533604166 x − 61.04085572001, R² = 0.373429184426089]
Figure 2: Scatterplot of the relationship between height and weight with regression equation and R-squared value
Based on the statistics shown in Figure 2 above, the Excel analytics tool gives the same
regression equation and R-squared value as the manual calculations in parts iv and v.

Question 3
i. Difference Between Classification and Prediction
According to Brownlee (2019), a classification problem is concerned with predicting a discrete
class label, while a prediction problem (here taken to mean regression) is concerned with
predicting a continuous quantity.
ii. Examples of Classification Methods
Shukla (2017) lists examples of classification methods, including linear classifiers such as
logistic regression, nearest neighbors (KNN), support vector machines (SVMs), decision trees
(DT), boosted trees, random forests, and neural networks.
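To make the idea of predicting a label concrete, here is a minimal sketch of one of the listed methods, k-nearest neighbors, in Python. The training data, the feature meanings (store visits per month, average basket value) and the labels are entirely hypothetical and chosen only for illustration; `knn_classify` is an assumed helper name, not a library function.

```python
from collections import Counter
from math import dist

# Hypothetical customers: (visits per month, average basket value) -> label.
training = [
    ((1.0, 20.0), "occasional"),
    ((2.0, 30.0), "occasional"),
    ((1.5, 25.0), "occasional"),
    ((8.0, 75.0), "regular"),
    ((9.0, 90.0), "regular"),
    ((10.0, 85.0), "regular"),
]


def knn_classify(point, training, k=3):
    """Return the majority label among the k training points nearest to `point`."""
    nearest = sorted(training, key=lambda item: dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_classify((2.0, 22.0), training))
print(knn_classify((9.0, 80.0), training))
```

The output is a class label rather than a number, which is the distinction drawn in part i. In practice the features would usually be rescaled first so that no single feature dominates the distance.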
iii. The algebraic equation for y1 in terms of input values i1,i2 and weights w
Generally, y1 can be given by y = y(x, w), which can also be defined as shown below:
\[ y = f\left(b + \sum_{i=1}^{n} x_i w_i\right) \]
where b = bias, x = input to the neuron, w = weights, n = the number of inputs from the incoming
layer, and i = the counter from 1 to n.
Thus the equation of the neural network can be given by:
Or alternatively
\[ f\left(b + \sum_{i=1}^{2} x_i \, w_1 w_2 w_3 w_4 w_5 w_6\right) \]
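The single-neuron formula f(b + Σ xᵢwᵢ) can be evaluated directly. The sketch below is in Python and assumes a sigmoid activation for f, which the text does not specify; the input values, weights and bias are hypothetical numbers chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here as the function f."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute f(b + sum_i x_i * w_i) for one neuron."""
    weighted_sum = bias + sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)


# Two inputs i1, i2 with hypothetical weights and bias.
i1, i2 = 1.0, 0.5
y1 = neuron_output([i1, i2], weights=[0.4, -0.2], bias=0.1)
print(f"y1 = {y1:.4f}")
```

Here the weighted sum is 0.1 + 1.0 × 0.4 + 0.5 × (−0.2) = 0.4, and the activation maps it into the interval (0, 1).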

How neural networks are used for classification
In application, neural networks contain neurons (units) arranged in layers that convert an input
vector into an output. Every unit accepts an input, applies a function to it, and passes the
result on to the next layer. The networks are generally defined to be feed-forward, i.e. a unit
feeds its output to the next layer but not vice versa. Further, weightings are applied to the
signals passing from one unit to another, and these weightings are "tuned in the training phase
to adapt a neural network to the particular problem at hand", i.e. the learning phase
(Bhadeshia, 2008). For classification, the output of the final layer is interpreted as the
predicted class.
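The feed-forward pass described above can be sketched in a few lines of Python. This is an illustration only: the network shape (two inputs, two hidden units, one output), the sigmoid activation, the 0.5 decision threshold, the class names and all weights are assumptions, and the weights are hand-picked rather than learned in a training phase.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weight_rows, biases):
    """Each unit applies sigmoid(bias + weighted sum of the layer's inputs)."""
    return [
        sigmoid(b + sum(x * w for x, w in zip(inputs, row)))
        for row, b in zip(weight_rows, biases)
    ]


def classify(inputs, layers, threshold=0.5):
    """Pass the inputs through every layer in order, then threshold the single output."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer_forward(activations, weight_rows, biases)
    score = activations[0]
    return ("positive" if score >= threshold else "negative"), score


# Hypothetical, untrained weights: one hidden layer (2 units) and one output unit.
layers = [
    ([[2.0, -1.0], [-1.5, 1.0]], [0.0, 0.0]),  # hidden layer
    ([[1.5, -1.5]], [0.0]),                    # output layer
]

print(classify([1.0, 0.0], layers))
print(classify([0.0, 1.0], layers))
```

Signals only move forward from one layer to the next, and changing the weights changes which inputs end up in which class, which is what the training phase adjusts.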
iv. How clustering can be used in business analytics
The basic definition of clustering is that it is the, “…how clustering can be used in business
analytics” (Vohra, 2018). Ideally, in business analytics, clustering can be used to identify the
groups of customers for a business depending on the customer characteristics. Below are three
possible applications of clustering in business:
a. Identifying grocery groups
A grocery supermarket can use clustering to segment its customers into, say, 5 groups based on
their buying behavior, such as those who buy fresh foods, convenience junkies, etcetera.
b. In banking
Banks can use clustering to group credit holders into likely defaulters and non-defaulters
based on different characteristics.
c. Rental Business
A car rental owner who wishes to understand customer preferences can use clustering to group
customers based on underlying characteristics, and so gain a general understanding of their
likely preferences.
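Customer segmentation of this kind is often done with k-means clustering. The following is a minimal sketch in Python, not a production implementation: the customer data (weekly spend on fresh food and on convenience items), the choice of two clusters and the starting centroids are all hypothetical, and `k_means` is an illustrative helper name.

```python
from math import dist


def assign(points, centroids):
    """Group each point with its nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)


# Hypothetical customers: (weekly fresh-food spend, weekly convenience spend).
customers = [(80, 10), (75, 15), (90, 5), (10, 70), (15, 85), (5, 60)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

With these made-up figures the procedure separates fresh-food buyers from convenience buyers; on real data the number of clusters and the starting centroids would need to be chosen and validated with care.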

References
Bhadeshia, H. (2008). Neural Networks in Materials Science: Classification Network. In H. Bhadeshia, Encyclopedia of Materials: Science and Technology (pp. 1-5). Amsterdam, Netherlands: Elsevier.
Brownlee, J. (2019, May 22). Machine Learning Mastery. Retrieved from Machine Learning Mastery: https://machinelearningmastery.com/classification-versus-regression-in-machine-learning/
Shukla, S. (2017, December 12). Regression and Classification | Supervised Machine Learning. Retrieved from Geeks for Geeks: https://www.geeksforgeeks.org/regression-classification-supervised-machine-learning/
Vohra, G. (2018, November 30). Cluster Analysis For Business. Retrieved from Analytics Training: https://analyticstraining.com/cluster-analysis-for-business/