Business Statistics Analysis


Added on  2019/10/31

AI Summary
The assignment involves analyzing the mean, median, mode, range, and standard deviation for four types of businesses. The frequency and relative frequency distribution, as well as the relative frequency histogram, are also presented. It is observed that the distributions are not normal due to the presence of skewness. A hypothesis test is conducted to determine if there is a significant difference in the starting costs of businesses, with the result rejecting the null hypothesis and accepting the alternative. Additionally, a regression model is developed using annual net sales as the dependent variable and various independent variables. The results show that nearly 99% of the changes in annual sales can be explained by the respective changes in the independent variables, indicating a good fit for the linear regression model. Furthermore, a hypothesis test is conducted to determine if at least one of the slopes is not equal to zero, with the result rejecting the null hypothesis and accepting the alternative.

STATISTICS
HI6007 GROUP ASSIGNMENT
STUDENT ID
[Pick the date]

Task 1
(1) The mean, median, mode, range and standard deviation for the given four types
of business are shown below:
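As a minimal sketch of how these five measures could be computed for one business type, the snippet below uses Python's standard `statistics` module. The list `startup_costs` is hypothetical illustration data, not the assignment's data set, and the sample (n - 1) standard deviation is an assumption about which definition was used.

```python
import statistics

# Hypothetical start-up costs ($000's) for one business type.
# These values are for illustration only; they are not the assignment's data.
startup_costs = [30, 45, 45, 60, 90]

summary = {
    "mean": statistics.mean(startup_costs),
    "median": statistics.median(startup_costs),
    "mode": statistics.mode(startup_costs),
    "range": max(startup_costs) - min(startup_costs),
    # Sample standard deviation (divides by n - 1); an assumed convention.
    "std_dev": statistics.stdev(startup_costs),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Repeating the same calculation for each business type would give one row of the summary table per type.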
(2) The frequency and relative frequency distributions and the relative frequency histogram
for each business type are shown below:
[Figure: Relative Frequency Histogram, X1 (startup cost for pizza). Horizontal axis: startup cost for pizza ($ 000's), in classes 0 to 30, 31 to 60, 61 to 90, 91 to 120 and 121 to 150. Vertical axis: relative frequency (%).]
[Figure: Relative Frequency Histogram, X2 (startup costs for baker/donuts). Horizontal axis: startup costs for baker/donuts ($ 000's), in classes 0 to 30, 31 to 60, 61 to 90, 91 to 120, 121 to 150 and 151 to 180. Vertical axis: relative frequency.]
[Figure: Relative Frequency Histogram, X3 (startup costs for shoe stores). Horizontal axis: startup costs for shoe stores ($ 000's), in classes 0 to 30, 31 to 60, 61 to 90, 91 to 120 and 121 to 150. Vertical axis: relative frequency.]
[Figure: Relative Frequency Histogram, X4 (startup costs for gift shops). Horizontal axis: startup costs for gift shops ($ 000's), in classes 0 to 30, 31 to 60, 61 to 90, 91 to 120 and 121 to 150. Vertical axis: relative frequency.]
[Figure: Relative Frequency Histogram, X5 (startup costs for pet stores). Horizontal axis: startup costs for pet stores ($ 000's), in classes 0 to 30, 31 to 60, 61 to 90 and 91 to 120. Vertical axis: relative frequency.]
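The relative frequency distributions behind histograms of this kind can be sketched as follows. The class limits match those on the histogram axes, but the `costs` list is hypothetical and the helper `class_label` is introduced here only for illustration. The sketch assumes whole-number costs, since a value such as 30.5 would fall between two classes.

```python
from collections import Counter

# Hypothetical start-up costs ($000's); not the assignment's data.
costs = [25, 40, 55, 58, 70, 85, 95, 110, 130, 45]

# Class limits as printed on the histogram axes.
classes = [(0, 30), (31, 60), (61, 90), (91, 120), (121, 150)]

def class_label(value):
    """Return the label of the class that contains value."""
    for low, high in classes:
        if low <= value <= high:
            return f"{low} to {high}"
    raise ValueError(f"{value} falls outside the defined classes")

# Frequency: number of observations in each class.
frequency = Counter(class_label(c) for c in costs)

# Relative frequency (%): class count divided by the total count.
relative_frequency = {
    f"{low} to {high}": 100 * frequency[f"{low} to {high}"] / len(costs)
    for low, high in classes
}

for label, pct in relative_frequency.items():
    print(f"{label:>10}: {frequency[label]:2d}  {pct:5.1f}%")
```

The bar heights of a relative frequency histogram are the percentages in `relative_frequency`, which always sum to 100.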
(3) Based on the above histograms and relative frequency tables, none of the business types
has a distribution that could be termed normal: for each business, the measures of central
tendency (mean, median and mode) differ from one another and skew is present. Further,
for most of the businesses (except pet stores), the start-up cost exceeds $ 30,000. The
lowest start-up costs appear to be for the pet store business.
(4) Hypothesis testing
Null hypothesis H0: There is no significant difference in the average starting costs of the
businesses.
Alternative hypothesis H1: At least one of the businesses has a statistically different average
starting cost from the others.
The output from the single-factor ANOVA is shown below:
Assuming level of significance = 5%
The value of the F statistic is 3.25 and the corresponding p value is 0.02.
Conclusion: The p value is lower than the level of significance and hence the null hypothesis is
rejected in favour of the alternative. Hence, there is a significant difference in the average
starting costs of the businesses.
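To make the mechanics of the single-factor ANOVA concrete, the sketch below computes the F statistic from between-group and within-group variation and then applies the decision rule used above. The `groups` data are a hypothetical toy example, not the start-up cost data, and the function name `one_way_anova_f` is introduced only for this illustration. The p value of 0.02 is taken from the text rather than recomputed.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")

# Decision rule applied to the values reported in the text.
alpha = 0.05
reported_p_value = 0.02
reject_null = reported_p_value < alpha
print("Reject H0" if reject_null else "Do not reject H0")
```

A large F means the group means differ by more than the spread within groups would suggest, which is what a small p value reflects.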
Task 2
(1) The regression model taking x1 as the dependent variable and x2, x3, x4, x5 and x6 as
independent variables is shown below:
Regression equation
x1 = -18.86 + (16.20 × x2) + (0.17 × x3) + (11.53 × x4) + (13.58 × x5) - (5.31 × x6)
Annual net sales
= -18.86 + (16.20 × number of sq. ft.) + (0.17 × inventory) + (11.53 × amount spent on advertising) + (13.58 × size of sales district) - (5.31 × number of competing stores in district)
(2) The value of R square is 0.9932, which indicates that nearly 99.32% of the variation in
annual net sales can be explained by the independent variables. Hence, the linear
regression model can be considered a good fit with high explanatory power.
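R square is the share of the total variation in the dependent variable that the fitted values account for. The sketch below shows that calculation on hypothetical observed and fitted values; `observed` and `fitted` are illustration data and do not come from the assignment's regression output.

```python
# Hypothetical observed values and fitted values from some regression.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]

mean_observed = sum(observed) / len(observed)

# Unexplained variation: squared gaps between observed and fitted values.
ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, fitted))

# Total variation: squared gaps between observed values and their mean.
ss_total = sum((y - mean_observed) ** 2 for y in observed)

r_squared = 1 - ss_residual / ss_total
print(f"R square = {r_squared:.4f}")
```

On the same reading, the reported value of 0.9932 means that only about 0.68% of the variation in annual net sales is left unexplained by the model.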
(3) Hypothesis testing
Null hypothesis H0: All the slopes are equal to zero.
Alternative hypothesis H1: At least one of the slopes is not equal to zero.
ANOVA table
Assuming level of significance = 5%
The value of the F statistic is 611.59 and the corresponding p value is approximately 0.
Conclusion: The p value is lower than the level of significance and hence the null hypothesis is
rejected in favour of the alternative hypothesis. Hence, it may be concluded that at least
one of the independent variables has a significant relationship with the dependent variable.
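The overall F statistic and R square are linked by the standard identity F = (R² / k) / ((1 - R²) / (n - k - 1)), where k is the number of slopes and n the number of observations. As a rough consistency check, the sketch below rearranges that identity to see what residual degrees of freedom the reported figures imply. It assumes k = 5 (slopes for x2 to x6); the sample size is not stated in the text, so it is left as an implied quantity rather than an input.

```python
# Values reported in the text.
r_squared = 0.9932
f_statistic = 611.59

# Assumption: five slope coefficients (x2 to x6).
k = 5

# Rearranging F = (R^2 / k) / ((1 - R^2) / df_residual) for df_residual.
implied_residual_df = f_statistic * k * (1 - r_squared) / r_squared
print(f"Implied residual degrees of freedom: {implied_residual_df:.2f}")

def overall_f(r2, num_slopes, df_residual):
    """Overall F statistic implied by R square and the degrees of freedom."""
    return (r2 / num_slopes) / ((1 - r2) / df_residual)

# Plugging the implied value back in recovers the reported F statistic.
print(f"F recovered: {overall_f(r_squared, k, implied_residual_df):.2f}")
```

Because R square is rounded to four decimals, the implied degrees of freedom should be read as approximate.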
(4) Interpretation of slope coefficient
Each slope is interpreted with the other independent variables held constant.
X2- The slope coefficient of 16.20 indicates that when the area is increased by 1 square foot, the
corresponding annual sales would increase by $ 16.20.
X3- The slope coefficient of 0.17 indicates that when the inventory is increased by $ 1, the
corresponding annual sales would increase by $ 0.17.
X4- The slope coefficient of 11.53 indicates that when the advertising spend is increased by $ 1,
the corresponding annual sales would increase by $ 11.53.
X5- The slope coefficient of 13.58 indicates that when the size of the sales district is increased by 1
family, the corresponding annual sales would increase by $ 13.58.
X6- The slope coefficient of -5.31 indicates that when the number of competing stores is increased by 1, the
annual sales would decrease by $ 5,310.
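The dollar figures in these interpretations depend on the units of each variable. The sketch below converts each reported coefficient into dollars of annual sales per single unit of the predictor. It assumes that sales and x2 to x5 are recorded in thousands, which is suggested by the worked prediction substituting 1 for 1000 sq. ft., and that x6 is a plain count. The dictionary `units_per_step` encodes that assumption and is not given in the text.

```python
# Slope coefficients as reported in the text; sales are in $000's.
slopes = {"x2": 16.20, "x3": 0.17, "x4": 11.53, "x5": 13.58, "x6": -5.31}

# Assumed size of a one-step change in each predictor, in natural units:
# x2 to x5 are taken to be in thousands, x6 is a count of stores.
units_per_step = {"x2": 1000, "x3": 1000, "x4": 1000, "x5": 1000, "x6": 1}

def dollars_per_single_unit(name):
    """Change in annual sales, in dollars, per one natural unit of a predictor."""
    sales_change_in_dollars = slopes[name] * 1000  # convert $000's to dollars
    return sales_change_in_dollars / units_per_step[name]

for name in slopes:
    print(f"{name}: {dollars_per_single_unit(name):,.2f} dollars per unit")
```

Under these assumptions the x2 to x5 slopes read directly as dollars per unit, while the x6 slope of -5.31 corresponds to $ 5,310 per additional competing store.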
(5) 95% confidence interval for the slope coefficient of individual variables
(6) Hypothesis testing
H0: Slope coefficient is not significant.
H1: Slope coefficient is significant.
(7) All the variables are significant and hence, the regression model is shown below:
(8) Annual net sales = ?
number of sq. ft. = 1000
inventory = $ 150,000
amount spent on advertising = $ 5000
size of sales district = 5000
number of competing stores in district = 2
Regression equation
Annual net sales
= -18.86 + (16.20 × number of sq. ft.) + (0.17 × inventory) + (11.53 × amount spent on advertising) + (13.58 × size of sales district) - (5.31 × number of competing stores in district)
Annual net sales ($ 000's)
= -18.86 + (16.20 × 1) + (0.17 × 150) + (11.53 × 5) + (13.58 × 5) - (5.31 × 2) = 138.448
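The substitution above can be reproduced with a short sketch. It assumes a negative intercept (-18.86) and a negative x6 coefficient (-5.31), which is the reading most consistent with the slope interpretation and the reported result. The dictionary keys are names introduced here for illustration. With the two-decimal coefficients the arithmetic gives about 137.77, so the reported 138.448 was presumably computed from unrounded coefficients.

```python
# Coefficients rounded to two decimals, as printed in the text.
# Assumption: the intercept and the x6 coefficient are negative.
intercept = -18.86
coefficients = {
    "sq_ft_thousands": 16.20,
    "inventory_thousands": 0.17,
    "advertising_thousands": 11.53,
    "district_size_thousands": 13.58,
    "competing_stores": -5.31,
}

# Inputs from the text, expressed in the units used in the substitution.
inputs = {
    "sq_ft_thousands": 1,          # 1000 sq. ft.
    "inventory_thousands": 150,    # $ 150,000
    "advertising_thousands": 5,    # $ 5000
    "district_size_thousands": 5,  # 5000
    "competing_stores": 2,
}

predicted_sales = intercept + sum(
    coefficients[name] * inputs[name] for name in coefficients
)
print(f"Predicted annual net sales ($000's): {predicted_sales:.2f}")
```

Either figure corresponds to predicted annual net sales of roughly $ 138,000.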